/tools/samtools/sam_bitwise_flag_filter.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="sam_bw_filter" name="Filter SAM" version="1.0.0">
  <description>on bitwise flag values</description>
  <parallelism method="basic"></parallelism>
  <command interpreter="python">
    sam_bitwise_flag_filter.py
      --input_sam_file=$input1
      --flag_column=2
      #for $bit in $bits
       '${bit.flags}=${bit.states}'
      #end for
      > $out_file1
  </command>
  <inputs>
    <param format="sam" name="input1" type="data" label="Select dataset to filter"/>
    <repeat name="bits" title="Flag">
      <param name="flags" type="select" label="Type">
        <option value="--0x0001">Read is paired</option>
        <option value="--0x0002">Read is mapped in a proper pair</option>
        <option value="--0x0004">The read is unmapped</option>
        <option value="--0x0008">The mate is unmapped</option>
        <option value="--0x0010">Read strand</option>
        <option value="--0x0020">Mate strand</option>
        <option value="--0x0040">Read is the first in a pair</option>
        <option value="--0x0080">Read is the second in a pair</option>
        <option value="--0x0100">The alignment of this read is not primary</option>
        <option value="--0x0200">The read fails platform/vendor quality checks</option>
        <option value="--0x0400">The read is a PCR or optical duplicate</option>
      </param>
      <param name="states" type="select" display="radio" label="Set the states for this flag">
        <option value="0">No</option>
        <option value="1">Yes</option>
      </param>
    </repeat>
  </inputs>
  <outputs>
    <data format="sam" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input1" value="sam_bw_filter.sam" ftype="sam"/>
      <param name="flags" value="Read is mapped in a proper pair"/>
      <param name="states" value="1"/>
      <output name="out_file1" file="sam_bw_filter_0002-yes.sam" ftype="sam"/>
    </test>
  </tests>
  <help>

**What it does**

Filters SAM datasets on the bitwise flag (the second column). The bits in the flag are defined as follows::

    Bit Info
 ------ --------------------------------------------------------------------------
 0x0001 the read is paired in sequencing, no matter whether it is mapped in a pair
 0x0002 the read is mapped in a proper pair (depends on the protocol, normally
        inferred during alignment)
 0x0004 the query sequence itself is unmapped
 0x0008 the mate is unmapped
 0x0010 strand of the query (0 for forward; 1 for reverse strand)
 0x0020 strand of the mate
 0x0040 the read is the first read in a pair (see below)
 0x0080 the read is the second read in a pair (see below)
 0x0100 the alignment is not primary (a read having split hits may
        have multiple primary alignment records)
 0x0200 the read fails platform/vendor quality checks
 0x0400 the read is either a PCR duplicate or an optical duplicate

Note the following:

- Flags 0x02, 0x08, 0x20, 0x40 and 0x80 are only meaningful when flag 0x01 is set.
- If the information on which read is the first in a pair was lost in the upstream analysis, flag 0x01 should be set, while 0x40 and 0x80 should both be zero.

-----

**Example**

Suppose the following dataset was generated with the BWA mapper::

 r001 163 ref  7 30 8M2I4M1D3M = 37  39 TTAGATAAAGGATACTA *
 r002   0 ref  9 30 3S6M1P1I4M *  0   0 AAAAGATAAGGATA    *
 r003   0 ref  9 30       5H6M *  0   0 AGCTAA            * NM:i:1
 r004   0 ref 16 30    6M14N5M *  0   0 ATAGCTTCAGC       *
 r003  16 ref 29 30       6H5M *  0   0 TAGGC             * NM:i:0
 r001  83 ref 37 30         9M =  7 -39 CAGCGCCAT         *

To select properly mapped pairs, click the **Add new Flag** button and set *Read is mapped in a proper pair* to **Yes**. The following two reads will be returned::

 r001 163 ref  7 30 8M2I4M1D3M = 37  39 TTAGATAAAGGATACTA *
 r001  83 ref 37 30         9M =  7 -39 CAGCGCCAT         *

For more information, please consult the `SAM format description`__.

.. __: http://www.ncbi.nlm.nih.gov/pubmed/19505943


  </help>
</tool>
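<!--
The filtering rule described in the help text can be sketched in Python as follows.
This is a minimal illustration, not the code of sam_bitwise_flag_filter.py: the names
FLAG_BITS, decode_flag, matches and filter_sam are hypothetical, and both the rule that
every selected condition must match and the pass-through of header lines are assumptions.

```python
# Hypothetical sketch of bitwise SAM flag filtering; not the tool's actual script.

# Bit meanings as listed in the help text above.
FLAG_BITS = {
    0x0001: "read is paired",
    0x0002: "read is mapped in a proper pair",
    0x0004: "read is unmapped",
    0x0008: "mate is unmapped",
    0x0010: "read is on the reverse strand",
    0x0020: "mate is on the reverse strand",
    0x0040: "first read in a pair",
    0x0080: "second read in a pair",
    0x0100: "alignment is not primary",
    0x0200: "fails platform/vendor quality checks",
    0x0400: "PCR or optical duplicate",
}


def decode_flag(flag):
    """Return the descriptions of all bits that are set in a SAM flag value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def matches(flag, wanted):
    """True if every bit in `wanted` (bit -> bool) has the requested state.

    Assumption: all selected conditions must hold at the same time.
    """
    return all(bool(flag & bit) == state for bit, state in wanted.items())


def filter_sam(lines, wanted, flag_column=2):
    """Yield the SAM records whose flag satisfies `wanted`.

    Assumption: header lines (starting with '@') are passed through unchanged.
    """
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        # Split on any whitespace; only the leading columns are needed.
        fields = line.split(None, flag_column)
        if matches(int(fields[flag_column - 1]), wanted):
            yield line


if __name__ == "__main__":
    sample = [
        "r001 163 ref  7 30 8M2I4M1D3M = 37  39 TTAGATAAAGGATACTA *",
        "r002   0 ref  9 30 3S6M1P1I4M *  0   0 AAAAGATAAGGATA    *",
        "r001  83 ref 37 30         9M =  7 -39 CAGCGCCAT         *",
    ]
    # Keep reads with bit 0x0002 set (mapped in a proper pair).
    for record in filter_sam(sample, {0x0002: True}):
        print(record)
```

Flag 163 is 0x80 + 0x20 + 0x02 + 0x01 and flag 83 is 0x40 + 0x10 + 0x02 + 0x01, so both
carry bit 0x0002, while flags 0 and 16 do not. That is why only the two r001 records are
returned in the help example.
-->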