/tools/fastx_toolkit/fastx_barcode_splitter.xml
https://bitbucket.org/cistrome/cistrome-harvard/ · XML · 76 lines · 51 code · 23 blank · 2 comment · 0 complexity · 7fb0c68cb722818be2cc80376181f06d MD5 · raw file
- <tool id="cshl_fastx_barcode_splitter" name="Barcode Splitter">
- <description></description>
- <requirements><requirement type="package">fastx_toolkit</requirement></requirements>
- <command interpreter="bash">fastx_barcode_splitter_galaxy_wrapper.sh $BARCODE $input "$input.name" "$output.files_path" --mismatches $mismatches --partial $partial $EOL > $output </command>
- <inputs>
- <param format="txt" name="BARCODE" type="data" label="Barcodes to use" />
- <param format="fasta,fastqsanger,fastqsolexa,fastqillumina" name="input" type="data" label="Library to split" />
- <param name="EOL" type="select" label="Barcodes found at">
- <option value="--bol">Start of sequence (5' end)</option>
- <option value="--eol">End of sequence (3' end)</option>
- </param>
- <param name="mismatches" type="integer" size="3" value="2" label="Number of allowed mismatches" />
-
- <param name="partial" type="integer" size="3" value="0" label="Number of allowed barcodes nucleotide deletions" />
-
- </inputs>
-
- <tests>
- <test>
- <!-- Split a FASTQ file -->
- <param name="BARCODE" value="fastx_barcode_splitter1.txt" />
- <param name="input" value="fastx_barcode_splitter1.fastq" ftype="fastqsolexa" />
- <param name="EOL" value="Start of sequence (5' end)" />
- <param name="mismatches" value="2" />
- <param name="partial" value="0" />
- <output name="output" file="fastx_barcode_splitter1.out" />
- </test>
- </tests>
- <outputs>
- <data format="html" name="output" />
- </outputs>
- <help>
- **What it does**
- This tool splits a Solexa library (FASTQ file) or a regular FASTA file into several files, using barcodes as the split criteria.
- --------
- **Barcode file Format**
- Barcode files are simple text files.
- Each line should contain an identifier (descriptive name for the barcode), and the barcode itself (A/C/G/T), separated by a TAB character.
- Example::
- #This line is a comment (starts with a 'number' sign)
- BC1 GATCT
- BC2 ATCGT
- BC3 GTGAT
- BC4 TGTCT
-
- For each barcode, a new FASTQ file will be created (with the barcode's identifier as part of the file name).
- Sequences matching the barcode will be stored in the appropriate file.
- One additional FASTQ file will be created (the 'unmatched' file), where sequences not matching any barcode will be stored.
- The output of this tool is an HTML file, displaying the split counts and the file locations.
- **Output Example**
- .. image:: ./static/fastx_icons/barcode_splitter_output_example.png
- ------
- This tool is based on `FASTX-toolkit`__ by Assaf Gordon.
- .. __: http://hannonlab.cshl.edu/fastx_toolkit/
-
- </help>
- </tool>
- <!-- FASTX-barcode-splitter is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->