/tools/regVariation/categorize_elements_satisfying_criteria.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="categorize_elements_satisfying_criteria" name="Categorize Elements" version="1.0.0">
  <description>satisfying criteria</description>

  <command interpreter="perl">
    categorize_elements_satisfying_criteria.pl $inputFile1 $inputFile2 $outputFile1
  </command>

  <inputs>
    <param format="tabular" name="inputFile1" type="data" label="Select file containing categories and their elements"/>
    <param format="tabular" name="inputFile2" type="data" label="Select file containing criteria and elements data"/>
  </inputs>

  <outputs>
    <data format="tabular" name="outputFile1"/>
  </outputs>

  <tests>
    <test>
      <param name="inputFile1" value="categories.tabular" ftype="tabular" />
      <param name="inputFile2" value="criteria_elements_data.tabular" ftype="tabular" />
      <output name="outputFile1" file="categorized_elements.tabular" />
    </test>
  </tests>

  <help>

.. class:: infomark

**What it does**

The program takes as input a set of categories, each containing one or more elements. It also takes a table relating elements to criteria, in which each element is assigned a number representing how many times it satisfies each criterion.

- The first input is a TABULAR format file in which the left column holds the names of the categories and all other columns hold the names of the elements in each category.
- The second input is a TABULAR format file relating elements to criteria, in which the first line holds the names of the criteria and the left column holds the names of the elements.
- The output is a TABULAR format file relating categories to criteria, in which each category is assigned the total number of times its elements satisfy each criterion. Each category is assigned as many numbers as there are criteria.


**Example**

Let the first input file be a group of motif categories as follows::

	Deletion_Hotspots		deletionHoptspot1		deletionHoptspot2		deletionHoptspot3
	Dna_Pol_Pause_Frameshift	dnaPolPauseFrameshift1		dnaPolPauseFrameshift2		dnaPolPauseFrameshift3		dnaPolPauseFrameshift4
	Indel_Hotspots			indelHotspot1
	Insertion_Hotspots		insertionHotspot1		insertionHotspot2
	Topoisomerase_Cleavage_Sites	topoisomeraseCleavageSite1	topoisomeraseCleavageSite2	topoisomeraseCleavageSite3


And let the second input file represent the number of times each motif occurs in a certain window size of indel flanking regions, as follows::

					10bp	20bp	40bp
	deletionHoptspot1		1	1	2
	deletionHoptspot2		1	1	1
	deletionHoptspot3		0	0	0
	dnaPolPauseFrameshift1		1	1	1
	dnaPolPauseFrameshift2		0	2	1
	dnaPolPauseFrameshift3		0	0	0
	dnaPolPauseFrameshift4		0	1	2
	indelHotspot1			0	0	0
	insertionHotspot1		0	0	1
	insertionHotspot2		1	1	1
	topoisomeraseCleavageSite1	1	1	1
	topoisomeraseCleavageSite2	1	2	1
	topoisomeraseCleavageSite3	0	0	2

Running the program will give the total number of times the motifs of each category occur in every window size of indel flanking regions::

					10bp	20bp	40bp
	Deletion_Hotspots		2	2	3
	Dna_Pol_Pause_Frameshift	1	4	4
	Indel_Hotspots			0	0	0
	Insertion_Hotspots		1	1	2
	Topoisomerase_Cleavage_Sites	2	3	4

  </help>

</tool>
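
The aggregation described in the help text can be sketched in a few lines of Python. This is an illustration only, not the Perl script that the tool actually runs: the function names (`split_fields`, `read_categories`, `read_counts`, `categorize`) are hypothetical, and two behaviours are assumptions rather than documented facts. Empty cells produced by padding tabs are ignored, and an element that is missing from the criteria table contributes zero.

```python
from collections import OrderedDict


def split_fields(line):
    """Split a tab-delimited line, dropping empty cells left by padding tabs.

    Assumption: repeated tabs are alignment padding, not empty values.
    """
    return [cell.strip() for cell in line.split("\t") if cell.strip()]


def read_categories(lines):
    """Map each category name (first column) to its list of element names."""
    categories = OrderedDict()
    for line in lines:
        fields = split_fields(line)
        if fields:
            categories[fields[0]] = fields[1:]
    return categories


def read_counts(lines):
    """Return (criteria, counts); counts maps element name -> list of ints."""
    rows = [split_fields(line) for line in lines]
    rows = [row for row in rows if row]
    criteria = rows[0]  # the first line names the criteria
    counts = {row[0]: [int(v) for v in row[1:]] for row in rows[1:]}
    return criteria, counts


def categorize(categories, criteria, counts):
    """Sum the per-element counts of every category, criterion by criterion."""
    totals = OrderedDict()
    for name, elements in categories.items():
        sums = [0] * len(criteria)
        for element in elements:
            # Assumption: an element absent from the table contributes zero.
            values = counts.get(element, [])
            for i, value in enumerate(values[:len(criteria)]):
                sums[i] += value
        totals[name] = sums
    return totals


if __name__ == "__main__":
    # Two categories taken from the example in the help text.
    category_lines = [
        "Deletion_Hotspots\t\tdeletionHoptspot1\t\tdeletionHoptspot2\t\tdeletionHoptspot3\t",
        "Insertion_Hotspots\t\tinsertionHotspot1\t\tinsertionHotspot2\t\t",
    ]
    count_lines = [
        "\t\t\t\t\t10bp\t20bp\t40bp\t",
        "deletionHoptspot1\t\t1\t1\t2",
        "deletionHoptspot2\t\t1\t1\t1",
        "deletionHoptspot3\t\t0\t0\t0",
        "insertionHotspot1\t\t0\t0\t1",
        "insertionHotspot2\t\t1\t1\t1",
    ]
    categories = read_categories(category_lines)
    criteria, counts = read_counts(count_lines)
    print("\t".join([""] + criteria))
    for name, sums in categorize(categories, criteria, counts).items():
        print("\t".join([name] + [str(s) for s in sums]))
```

For the two categories used here, the sums match the corresponding rows of the example output in the help text: 2, 2, 3 for `Deletion_Hotspots` and 1, 1, 2 for `Insertion_Hotspots`.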