/tools/regVariation/compute_motif_frequencies_for_all_motifs.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="compute_motif_frequencies_for_all_motifs" name="Compute Motif Frequencies For All Motifs" version="1.0.0">
  <description>motif by motif</description>

  <command interpreter="perl">
    compute_motif_frequencies_for_all_motifs.pl $inputFile1 $inputFile2 $inputWindowSize3 $outputFile1
  </command>

  <inputs>
    <param format="tabular" name="inputFile1" type="data" label="Select the motifs file"/>
    <param format="tabular" name="inputFile2" type="data" label="Select the indel flanking sequences windows file"/>
    <param type="integer" name="inputWindowSize3" size="6" value="0" label="What is the number of 10bp windows in which the motif frequencies will be computed?" help="'0' = one window only"/>
  </inputs>

  <outputs>
    <data format="tabular" name="outputFile1"/>
  </outputs>

  <tests>
    <test>
      <param name="inputFile1" value="motifs2.tabular" />
      <param name="inputFile2" value="flankingSequencesWindows10_2.tabular" />
      <param name="inputWindowSize3" value="0" />
      <output name="outputFile1" file="motifFrequencies_every_indels0.tabular" />
    </test>

    <test>
      <param name="inputFile1" value="motifs2.tabular" />
      <param name="inputFile2" value="flankingSequencesWindows10_2.tabular" />
      <param name="inputWindowSize3" value="4" />
      <output name="outputFile1" file="motifFrequencies_every_indels4.tabular" />
    </test>
  </tests>

  <help>

.. class:: infomark

**What it does**

This program computes the frequency of each motif, within a user-specified number of windows, in both the upstream and downstream sequences flanking indels in all chromosomes.

- The first input is a TABULAR format file containing the motif names and sequences, one line per motif, such that the file consists of two columns:

 - The left column represents the motif name
 - The right column represents the motif sequence, as follows::

	dnaPolPauseFrameshift1	GAG
	dnaPolPauseFrameshift2	ACG
	xSites1			CCG

- The second input is a TABULAR format file representing the windows of both the upstream and downstream flanking sequences. It consists of multiple left columns representing the windows of the upstream flanking sequences, followed by one column representing the indels, followed by multiple right columns representing the windows of the downstream flanking sequences, as follows::

	cgaggtcagg	agatcgagac	catcctggct	aacatggtga	aatcccgtct	ctactaaaaa	indel	aaatttatat	ttataaacaa	ttttaataca	cctatgttta	ttatacattt
	GCCAGTTTAT	GGTCTAACAA	GGAGAGAAAC	AGGGGGCTGA	AGGGGTTTCT	TAACCTCCAG	indel	TTCCGGGCTC	TGTCCCTAAC	CCCCAGCTAG	GTAAGTGGCA	AAGCACTTCT
	CAGTGGGACC	AAGCACTGAA	CCACTTTGGG	GAGAATCTCA	CACTGGGGCC	CTCTGACACC	indel	tatatatttt	tttttttttt	tttttttttt	tttttttttg	agatggtgtc
	AGAGCAGCAG	CACCCACTTT	TGCAGTGTGT	GACGTTGGTG	GAGCCATCGA	AGTCTGTGCT	indel	GAGCCCTCCC	CAGTGCTCCG	AGGAGCTGCT	GTTCCCCCTG	GAGCTCAGAA

- The third input is an integer giving the number of windows to consider, counting leftward from the indel for the upstream flanking sequence and rightward from the indel for the downstream flanking sequence.

- The output is a TABULAR format file consisting of three columns:

 - The left column represents the motif name
 - The middle column represents the motif frequency in the specified windows of the upstream sequence flanking an indel
 - The right column represents the motif frequency in the specified windows of the downstream sequence flanking an indel

 There is one line per motif for each indel in the output file, so the total number of lines in the output file = number of motifs x number of indels.

Note: The number of windows entered by the user must be a positive integer >= 1. If a negative integer or 0 is entered, the program treats it as 1.
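
The counting described above can be sketched in Python as follows. This is only an illustrative sketch, not the Perl script this tool runs: the function names ``count_overlapping`` and ``motif_frequencies`` are hypothetical, and it assumes that matching is case-insensitive, that overlapping occurrences are counted, that motifs are counted within each 10bp window separately, that output lines are ordered indel by indel, and that the indel column holds the literal text ``indel`` as in the example rows.

```python
def count_overlapping(sequence, motif):
    """Count occurrences of motif in sequence (overlaps included, case ignored)."""
    sequence, motif = sequence.upper(), motif.upper()
    count, start = 0, sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


def motif_frequencies(motifs, window_rows, n_windows):
    """Return (motif name, upstream count, downstream count), one tuple per motif per indel."""
    n_windows = max(1, n_windows)  # 0 or a negative value is treated as 1
    rows = []
    for fields in window_rows:
        indel_at = fields.index("indel")  # assumption: literal marker column
        # The n_windows windows closest to the indel on each side.
        upstream = fields[max(0, indel_at - n_windows):indel_at]
        downstream = fields[indel_at + 1:indel_at + 1 + n_windows]
        for name, motif in motifs:
            up = sum(count_overlapping(w, motif) for w in upstream)
            down = sum(count_overlapping(w, motif) for w in downstream)
            rows.append((name, up, down))
    return rows


# Usage with the three example motifs and the first example row shown above.
motifs = [
    ("dnaPolPauseFrameshift1", "GAG"),
    ("dnaPolPauseFrameshift2", "ACG"),
    ("xSites1", "CCG"),
]
first_row = (
    "cgaggtcagg agatcgagac catcctggct aacatggtga aatcccgtct ctactaaaaa indel "
    "aaatttatat ttataaacaa ttttaataca cctatgttta ttatacattt"
).split()

for name, up, down in motif_frequencies(motifs, [first_row], 2):
    print(name, up, down, sep="\t")
```

With two windows on each side, only ``xSites1`` is found in this row (once, upstream, in ``aatcccgtct``). Whether the actual script counts overlapping matches or matches that span two adjacent windows is not stated here, so results for other inputs may differ from this sketch.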

  </help>

</tool>