/tools/next_gen_conversion/solid2fastq.xml

<tool id="solid2fastq" name="Convert">
  <description>SOLiD output to fastq</description>
  <command interpreter="python">
    #if $is_run.paired == "no" #solid2fastq.py --fr=$input1 --fq=$input2 --fout=$out_file1 -q $qual $trim_name $trim_first_base $double_encode
    #elif $is_run.paired == "yes" #solid2fastq.py --fr=$input1 --fq=$input2 --fout=$out_file1 --rr=$input3 --rq=$input4 --rout=$out_file2 -q $qual $trim_name $trim_first_base $double_encode
    #end if#
  </command>
  <inputs>
    <param name="input1" type="data" format="csfasta" label="Select reads"/>
    <param name="input2" type="data" format="qualsolid" label="Select qualities"/>
    <conditional name="is_run">
      <param name="paired" type="select" label="Is this a mate-pair run?">
        <option value="no" selected="true">No</option>
        <option value="yes">Yes</option>
      </param>
      <when value="yes">
        <param name="input3" type="data" format="csfasta" label="Select Reverse reads"/>
        <param name="input4" type="data" format="qualsolid" label="Select Reverse qualities"/>
      </when>
      <when value="no">
      </when>
    </conditional>
    <param name="qual" label="Remove reads containing color qualities below this value" type="integer" value="0"/>
    <param name="trim_name" type="select" label="Trim trailing &quot;_F3&quot; and &quot;_R3&quot; ?">
      <option value="-t" selected="true">Yes</option>
      <option value="">No</option>
    </param>
    <param name="trim_first_base" type="select" label="Trim first base?">
      <option value="-f">Yes (BWA)</option>
      <option value="" selected="true">No (bowtie)</option>
    </param>
    <param name="double_encode" type="select" label="Double encode?">
      <option value="-d">Yes (BWA)</option>
      <option value="" selected="true">No (bowtie)</option>
    </param>
  </inputs>
  <outputs>
    <data format="fastqcssanger" name="out_file1"/>
    <data format="fastqcssanger" name="out_file2">
      <filter>is_run['paired'] == 'yes'</filter>
    </data>
  </outputs>
  <tests>
    <test>
      <param name="input1" value="fr.csfasta" ftype="csfasta"/>
      <param name="input2" value="fr.qualsolid" ftype="qualsolid" />
      <param name="paired" value="no"/>
      <param name="qual" value="0" />
      <param name="trim_first_base" value="No" />
      <param name="trim_name" value="No" />
      <param name="double_encode" value="No"/>
      <output name="out_file1" file="solid2fastq_out_1.fastq"/>
    </test>
    <test>
      <param name="input1" value="fr.csfasta" ftype="csfasta"/>
      <param name="input2" value="fr.qualsolid" ftype="qualsolid" />
      <param name="paired" value="yes"/>
      <param name="input3" value="rr.csfasta" ftype="csfasta"/>
      <param name="input4" value="rr.qualsolid" ftype="qualsolid" />
      <param name="qual" value="0" />
      <param name="trim_first_base" value="No" />
      <param name="trim_name" value="Yes" />
      <param name="double_encode" value="No"/>
      <output name="out_file1" file="solid2fastq_out_2.fastq"/>
      <output name="out_file2" file="solid2fastq_out_3.fastq"/>
    </test>
  </tests>
  <help>

**What it does**

Converts the output of a SOLiD instrument (versions 3.5 and earlier) to fastq format suitable for the bowtie, BWA, and PerM mappers.

--------

**Input datasets**

Below are examples of forward (F3) reads and quality scores:

Reads::

    &gt;1831_573_1004_F3
    T00030133312212111300011021310132222
    &gt;1831_573_1567_F3
    T03330322230322112131010221102122113

Quality scores::

    &gt;1831_573_1004_F3
    4 29 34 34 32 32 24 24 20 17 10 34 29 20 34 13 30 34 22 24 11 28 19 17 34 17 24 17 25 34 7 24 14 12 22
    &gt;1831_573_1567_F3
    8 26 31 31 16 22 30 31 28 29 22 30 30 31 32 23 30 28 28 31 19 32 30 32 19 8 32 10 13 6 32 10 6 16 11

**Mate pairs**

If your data is from a mate-paired run, you will have additional read and quality datasets that look similar to the ones above, with one exception: the read names end with "_R3".
In this case, choose **Yes** from the *Is this a mate-pair run?* drop-down and you will be able to select the R3 reads and qualities. When processing mate pairs, this tool generates two output files: one for F3 reads and the other for R3 reads.
The reads are guaranteed to be paired: mated reads occupy the same position in the F3 and R3 fastq files. However, because pairing is verified, processing an entire SOLiD run may take several hours.

------

**Explanation of parameters**

**Remove reads containing color qualities below this value** - any read that contains at least one color call with quality lower than the specified value **will not** be reported.

**Trim trailing "_F3" and "_R3"?** - removes the "_F3" and "_R3" suffixes from read names. Not necessary for bowtie. Required for BWA.

**Trim first base?** - SOLiD reads contain an adapter base, such as the first T in this read::

    &gt;1831_573_1004_F3
    T00030133312212111300011021310132222

This option removes that base, leaving only color calls. Not necessary for bowtie. Required for BWA.

**Double encode?** - converts color calls (0123.) to pseudo-nucleotides (ACGTN). Not necessary for bowtie. Required for BWA.

------

**Examples of output**

With all parameters left at their defaults, the reads and qualities shown above produce::

    @1831_573_1004
    T00030133312212111300011021310132222
    +
    %%&gt;CCAA9952+C&gt;5C.?C79,=42C292:C(9/-7
    @1831_573_1567
    T03330322230322112131010221102122113
    +
    );@@17?@=&gt;7??@A8?==@4A?A4)A+.'A+'1,

Setting *Trim first base?* to **Yes** will produce this::

    @1831_573_1004
    00030133312212111300011021310132222
    +
    %%&gt;CCAA9952+C&gt;5C.?C79,=42C292:C(9/-7
    @1831_573_1567
    03330322230322112131010221102122113
    +
    );@@17?@=&gt;7??@A8?==@4A?A4)A+.'A+'1,

Finally, setting *Double encode?* to **Yes** will yield::

    @1831_573_1004
    TAAATACTTTCGGCGCCCTAAACCAGCTCACTGGGG
    +
    %%&gt;CCAA9952+C&gt;5C.?C79,=42C292:C(9/-7
    @1831_573_1567
    TATTTATGGGTATGGCCGCTCACAGGCCAGCGGCCT
    +
    );@@17?@=&gt;7??@A8?==@4A?A4)A+.'A+'1,

  </help>
</tool>
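<!--
The help text describes three per-read operations: the quality filter, double
encoding, and conversion of integer qualities to fastqcssanger characters. The
block below is a minimal Python sketch of those operations, not the code of
solid2fastq.py. The function names (double_encode, phred33, passes_filter) are
hypothetical, and the sketch assumes non-negative integer qualities and the
0123. to ACGTN mapping stated in the help.

```python
# Illustrative sketch only; names and structure are assumptions,
# not taken from solid2fastq.py.

# Color calls 0, 1, 2, 3 and "." map to pseudo-nucleotides A, C, G, T and N.
DOUBLE_ENCODE = str.maketrans("0123.", "ACGTN")


def double_encode(read):
    """Translate color calls to pseudo-nucleotides.

    A leading adapter base (a letter such as "T") is kept unchanged;
    a read whose first base was already trimmed is translated in full.
    """
    if read[:1].isalpha():
        return read[0] + read[1:].translate(DOUBLE_ENCODE)
    return read.translate(DOUBLE_ENCODE)


def phred33(qualities):
    """Encode space-separated integer qualities as Sanger (Phred+33) characters."""
    return "".join(chr(int(q) + 33) for q in qualities.split())


def passes_filter(qualities, min_qual):
    """Return True when no color call has a quality below min_qual."""
    return all(int(q) >= min_qual for q in qualities.split())


if __name__ == "__main__":
    read = "T00030133312212111300011021310132222"
    print(double_encode(read))    # TAAATACTTTCGGCGCCCTAAACCAGCTCACTGGGG
    print(phred33("4 29 34 34"))  # %>CC
    print(passes_filter("4 29 34 34", 5))  # False
```

The translation table keeps the mapping in one place, so the same function
handles reads with or without the leading adapter base.
-->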