/tools/regVariation/categorize_elements_satisfying_criteria.xml
https://bitbucket.org/cistrome/cistrome-harvard/ · XML · 78 lines · 57 code · 21 blank · 0 comment · 0 complexity · 5ff7ae14a22117c01dea3f5528a815b5 MD5 · raw file
- <tool id="categorize_elements_satisfying_criteria" name="Categorize Elements" version="1.0.0">
- <description>satisfying criteria</description>
-
- <command interpreter="perl">
- categorize_elements_satisfying_criteria.pl $inputFile1 $inputFile2 $outputFile1
- </command>
- <inputs>
- <param format="tabular" name="inputFile1" type="data" label="Select file containing categories and their elements"/>
- <param format="tabular" name="inputFile2" type="data" label="Select file containing criteria and elements data"/>
- </inputs>
-
- <outputs>
- <data format="tabular" name="outputFile1"/>
- </outputs>
- <tests>
- <test>
- <param name="inputFile1" value="categories.tabular" ftype="tabular" />
- <param name="inputFile2" value="criteria_elements_data.tabular" ftype="tabular" />
- <output name="outputFile1" file="categorized_elements.tabular" />
- </test>
- </tests>
-
-
- <help>
- .. class:: infomark
- **What it does**
- The program takes as input a set of categories, such that each category contains many elements. It also takes a table relating elements with criteria, such that each element is assigned a number representing the number of times the element satisfies a certain criterion.
- - The first input is a TABULAR format file, such that the left column represents the names of categories and, all other columns represent the names of elements in each category.
- - The second input is a TABULAR format file relating elements with criteria, such that the first line represents the names of criteria and the left column represents the names of elements.
- - The output is a TABULAR format file relating catergories with criteria, such that each categoy is assigned a number representing the total number of times its elements satisfies a certain criterion.. Each category is assigned as many numbers as criteria.
- **Example**
- Let the first input file be a group of motif categories as follows::
- Deletion_Hotspots deletionHoptspot1 deletionHoptspot2 deletionHoptspot3
- Dna_Pol_Pause_Frameshift dnaPolPauseFrameshift1 dnaPolPauseFrameshift2 dnaPolPauseFrameshift3 dnaPolPauseFrameshift4
- Indel_Hotspots indelHotspot1
- Insertion_Hotspots insertionHotspot1 insertionHotspot2
- Topoisomerase_Cleavage_Sites topoisomeraseCleavageSite1 topoisomeraseCleavageSite2 topoisomeraseCleavageSite3
- And let the second input file represent the number of times each motif occurs in a certain window size of indel flanking regions, as follows::
- 10bp 20bp 40bp
- deletionHoptspot1 1 1 2
- deletionHoptspot2 1 1 1
- deletionHoptspot3 0 0 0
- dnaPolPauseFrameshift1 1 1 1
- dnaPolPauseFrameshift2 0 2 1
- dnaPolPauseFrameshift3 0 0 0
- dnaPolPauseFrameshift4 0 1 2
- indelHotspot1 0 0 0
- insertionHotspot1 0 0 1
- insertionHotspot2 1 1 1
- topoisomeraseCleavageSite1 1 1 1
- topoisomeraseCleavageSite2 1 2 1
- topoisomeraseCleavageSite3 0 0 2
- Running the program will give the total number of times the motifs of each category occur in every window size of indel flanking regions::
- 10bp 20bp 40bp
- Deletion_Hotspots 2 2 3
- Dna_Pol_Pause_Frameshift 1 4 4
- Indel_Hotspots 0 0 0
- Insertion_Hotspots 1 1 2
- Topoisomerase_Cleavage_Sites 2 3 4
- </help>
-
- </tool>