/tools/samtools/sam_bitwise_flag_filter.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="sam_bw_filter" name="Filter SAM" version="1.0.0">
  <description>on bitwise flag values</description>
  <parallelism method="basic"></parallelism>
  <command interpreter="python">
    sam_bitwise_flag_filter.py
      --input_sam_file=$input1
      --flag_column=2
      #for $bit in $bits
        '${bit.flags}=${bit.states}'
      #end for
      > $out_file1
  </command>
  <inputs>
    <param format="sam" name="input1" type="data" label="Select dataset to filter"/>
    <repeat name="bits" title="Flag">
      <param name="flags" type="select" label="Type">
        <option value="--0x0001">Read is paired</option>
        <option value="--0x0002">Read is mapped in a proper pair</option>
        <option value="--0x0004">The read is unmapped</option>
        <option value="--0x0008">The mate is unmapped</option>
        <option value="--0x0010">Read strand</option>
        <option value="--0x0020">Mate strand</option>
        <option value="--0x0040">Read is the first in a pair</option>
        <option value="--0x0080">Read is the second in a pair</option>
        <option value="--0x0100">The alignment of this read is not primary</option>
        <option value="--0x0200">The read fails platform/vendor quality checks</option>
        <option value="--0x0400">The read is a PCR or optical duplicate</option>
      </param>
      <param name="states" type="select" display="radio" label="Set the states for this flag">
        <option value="0">No</option>
        <option value="1">Yes</option>
      </param>
    </repeat>
  </inputs>
  <outputs>
    <data format="sam" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input1" value="sam_bw_filter.sam" ftype="sam"/>
      <param name="flags" value="Read is mapped in a proper pair"/>
      <param name="states" value="1"/>
      <output name="out_file1" file="sam_bw_filter_0002-yes.sam" ftype="sam"/>
    </test>
  </tests>
  <help>

**What it does**

Filters SAM datasets on the bitwise flag (the second column). The bits in the flag are defined as follows::

    Bit    Info
    ------ --------------------------------------------------------------------------
    0x0001 the read is paired in sequencing, no matter whether it is mapped in a pair
    0x0002 the read is mapped in a proper pair (depends on the protocol, normally
           inferred during alignment) (see below)
    0x0004 the query sequence itself is unmapped
    0x0008 the mate is unmapped (see below)
    0x0010 strand of the query (0 for forward; 1 for reverse strand)
    0x0020 strand of the mate (see below)
    0x0040 the read is the first read in a pair (see below)
    0x0080 the read is the second read in a pair (see below)
    0x0100 the alignment is not primary (a read having split hits may
           have multiple primary alignment records)
    0x0200 the read fails platform/vendor quality checks
    0x0400 the read is either a PCR duplicate or an optical duplicate

Note the following:

- Flags 0x02, 0x08, 0x20, 0x40 and 0x80 are only meaningful when flag 0x01 is set.
- If the information on which read is the first in a pair was lost in the upstream analysis, flag 0x01 should be set, while 0x40 and 0x80 should both be zero.

-----

**Example**

Suppose the following dataset was generated with the BWA mapper::

    r001 163 ref  7 30 8M2I4M1D3M = 37  39 TTAGATAAAGGATACTA *
    r002   0 ref  9 30 3S6M1P1I4M *  0   0 AAAAGATAAGGATA    *
    r003   0 ref  9 30 5H6M       *  0   0 AGCTAA            * NM:i:1
    r004   0 ref 16 30 6M14N5M    *  0   0 ATAGCTTCAGC       *
    r003  16 ref 29 30 6H5M       *  0   0 TAGGC             * NM:i:0
    r001  83 ref 37 30 9M         =  7 -39 CAGCGCCAT         *

To select properly mapped pairs, click the **Add new Flag** button and set *Read is mapped in a proper pair* to **Yes**. The following two reads will be returned::

    r001 163 ref  7 30 8M2I4M1D3M = 37  39 TTAGATAAAGGATACTA *
    r001  83 ref 37 30 9M         =  7 -39 CAGCGCCAT         *

For more information, please consult the `SAM format description`__.

.. __: http://www.ncbi.nlm.nih.gov/pubmed/19505943

  </help>
</tool>
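
The selection described in the tool's help text can be illustrated with a short Python sketch. It is not the code of `sam_bitwise_flag_filter.py`, which is not shown here. The helper names `parse_requirement` and `keep` are hypothetical, the sample records are the ones from the help example (tab-separated, with no `@` header lines), and the sketch assumes that several flag conditions are combined with a logical AND.

```python
# Illustrative sketch only; the real sam_bitwise_flag_filter.py may differ.
SAM_LINES = [
    "r001\t163\tref\t7\t30\t8M2I4M1D3M\t=\t37\t39\tTTAGATAAAGGATACTA\t*",
    "r002\t0\tref\t9\t30\t3S6M1P1I4M\t*\t0\t0\tAAAAGATAAGGATA\t*",
    "r003\t0\tref\t9\t30\t5H6M\t*\t0\t0\tAGCTAA\t*\tNM:i:1",
    "r004\t0\tref\t16\t30\t6M14N5M\t*\t0\t0\tATAGCTTCAGC\t*",
    "r003\t16\tref\t29\t30\t6H5M\t*\t0\t0\tTAGGC\t*\tNM:i:0",
    "r001\t83\tref\t37\t30\t9M\t=\t7\t-39\tCAGCGCCAT\t*",
]


def parse_requirement(arg):
    """Turn an argument such as '--0x0002=1' into (bit mask, required state)."""
    name, state = arg.lstrip("-").split("=")
    return int(name, 16), state == "1"


def keep(line, requirements, flag_column=2):
    """Return True if every requested bit of the flag has the requested state."""
    # flag_column is 1-based, matching --flag_column=2 in the tool's command.
    flag = int(line.split("\t")[flag_column - 1])
    return all(bool(flag & mask) == wanted for mask, wanted in requirements)


# "Read is mapped in a proper pair" set to "Yes" becomes '--0x0002=1'.
requirements = [parse_requirement("--0x0002=1")]
selected = [line for line in SAM_LINES if keep(line, requirements)]

for line in selected:
    print(line)
```

Flags 163 and 83 both have bit 0x0002 set, so only the two `r001` records are kept, which matches the result shown in the help text.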