uniq.xml | searchcode

/tools/filters/uniq.xml

https://bitbucket.org/cistrome/cistrome-harvard/ · XML · 75 lines · 60 code · 15 blank · 0 comment · 0 complexity · 43305040d5cfb5822d42d4fa3addbb07 MD5 · raw file

<tool id="Count1" name="Count">
  <description>occurrences of each record</description>
  <command interpreter="python">uniq.py -i $input -o $out_file1 -c "$column" -d $delim</command>
  <inputs>
    <param name="input" type="data" format="tabular" label="from dataset" help="Dataset missing? See TIP below"/>
    <param name="column" type="data_column" data_ref="input" multiple="True" numerical="False" label="Count occurrences of values in column(s)" help="Multi-select list - hold the appropriate key while clicking to select multiple columns" />
    <param name="delim" type="select" label="Delimited by">
      <option value="T">Tab</option>
      <option value="Sp">Whitespace</option>
      <option value="Dt">Dot</option>
      <option value="C">Comma</option>
      <option value="D">Dash</option>
      <option value="U">Underscore</option>
      <option value="P">Pipe</option>
    </param>
  </inputs>
  <outputs>
    <data format="tabular" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input" value="1.bed"/>
      <output name="out_file1" file="uniq_out.dat"/>
      <param name="column" value="1"/>
      <param name="delim" value="T"/>
    </test>
  </tests>
  <help>
  
 .. class:: infomark

**TIP:** If your data is not TAB delimited, use *Text Manipulation-&gt;Convert*

-----

**Syntax**

This tool counts occurrences of unique values in selected column(s).

- If multiple columns are selected, counting is performed on each unique group of all values in the selected columns.
- The first column of the resulting dataset will be the count of unique values in the selected column(s) and will be followed by each value.

-----

**Example**

- Input file::
     
       chr1   10  100  gene1
       chr1  105  200  gene2
       chr1  205  300  gene3
       chr2   10  100  gene4
       chr2 1000 1900  gene5
       chr3   15 1656  gene6
       chr4   10 1765  gene7
       chr4   10 1765  gene8

- Counting unique values in column c1 will result in::

       3 chr1
       2 chr2
       1 chr3
       2 chr4   

- Counting unique values in the grouping of columns c2 and c3 will result in::

       2    10    100
       2    10    1765
       1    1000  1900
       1    105   200
       1    15    1656
       1    205   300

</help>
</tool>