/tools/regVariation/featureCounter.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="featureCoverage1" name="Feature coverage" version="2.0.0">
  <description></description>
  <command interpreter="python">featureCounter.py $input1 $input2 $output -1 ${input1.metadata.chromCol},${input1.metadata.startCol},${input1.metadata.endCol},${input1.metadata.strandCol} -2 ${input2.metadata.chromCol},${input2.metadata.startCol},${input2.metadata.endCol},${input2.metadata.strandCol}</command>
  <inputs>
    <param format="interval" name="input1" type="data" help="First dataset">
      <label>What portion of</label>
    </param>
    <param format="interval" name="input2" type="data" help="Second dataset">
      <label>is covered by</label>
    </param>
  </inputs>
  <outputs>
    <data format="interval" name="output" metadata_source="input1" />
  </outputs>
  <tests>
    <test>
      <param name="input1" value="1.bed" />
      <param name="input2" value="2.bed" />
      <output name="output" file="6_feature_coverage.bed" />
    </test>
    <test>
      <param name="input1" value="chrY1.bed" />
      <param name="input2" value="chrY2.bed" />
      <output name="output" file="chrY_Coverage.bed" />
    </test>
  </tests>
  <help>

.. class:: infomark

**What it does**

This tool finds how much of each interval in the first dataset is covered by intervals in the second dataset. The coverage and feature counts are appended as 4 new columns in the resulting dataset.

-----

**Example**

- If **First dataset** consists of the following windows::

    chrX 1 10001 seg 0 -
    chrX 10001 20001 seg 0 -
    chrX 20001 30001 seg 0 -
    chrX 30001 40001 seg 0 -

- and **Second dataset** consists of the following exons::

    chrX 5000 6000 seg2 0 -
    chrX 5500 7000 seg2 0 -
    chrX 9000 22000 seg2 0 -
    chrX 24000 34000 seg2 0 -
    chrX 36000 38000 seg2 0 -

- the **Result** is the coverage by exons of the second dataset of each window in the first dataset::

    chrX 1 10001 seg 0 - 3001 0.3001 2 1
    chrX 10001 20001 seg 0 - 10000 1.0 1 0
    chrX 20001 30001 seg 0 - 8000 0.8 0 2
    chrX 30001 40001 seg 0 - 5999 0.5999 1 1

- To clarify, the following line of output (added columns are indexed by a, b, c and d)::

                         a    b      c d
    chrX 1 10001 seg 0 - 3001 0.3001 2 1

  implies that 2 exons (c) fall fully within this window (chrX:1-10001), 1 exon (d) partially overlaps this window, and these 3 exons cover 30.01% (b) of the window size, spanning 3001 nucleotides (a).

  * a: number of nucleotides in this window covered by the features counted in (c) and (d); features that overlap each other are merged before (a) is calculated
  * b: fraction of the window size covered by the features counted in (c) and (d); features that overlap each other are merged before (b) is calculated
  * c: number of features in the 2nd dataset that fall **completely** within this window
  * d: number of features in the 2nd dataset that **partially** overlap this window

  </help>
</tool>
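The four added columns described in the help text can be illustrated with a small Python sketch. It is not the code in `featureCounter.py`; the names `merged_length` and `feature_coverage` are hypothetical, and the sketch assumes half-open `[start, end)` coordinates and a strict reading of the (c) and (d) definitions. Under that strict reading, a feature that spans an entire window (as in the second example row) is counted as a partial overlap, so the counts for that row may differ from what the tool itself reports.

```python
from typing import List, Tuple

# Assumed representation: (chrom, start, end) with half-open [start, end) coordinates.
Interval = Tuple[str, int, int]


def merged_length(spans: List[Tuple[int, int]]) -> int:
    """Return the total length of the union of half-open spans."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(spans):
        if cur_end is None or start > cur_end:
            # Start a new merged run; flush the previous one first.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent span: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def feature_coverage(window: Interval, features: List[Interval]):
    """Return (a, b, c, d) for one window, following the help text definitions."""
    chrom, w_start, w_end = window
    clipped = []
    full = partial = 0
    for f_chrom, f_start, f_end in features:
        # Skip features on another chromosome or with no overlap.
        if f_chrom != chrom or f_end <= w_start or f_start >= w_end:
            continue
        if f_start >= w_start and f_end <= w_end:
            full += 1      # (c) feature lies completely within the window
        else:
            partial += 1   # (d) feature only partially overlaps the window
        # Clip the feature to the window before merging.
        clipped.append((max(f_start, w_start), min(f_end, w_end)))
    covered = merged_length(clipped)                  # (a)
    fraction = covered / (w_end - w_start)            # (b)
    return covered, fraction, full, partial


# Example data copied from the help text.
exons = [
    ("chrX", 5000, 6000),
    ("chrX", 5500, 7000),
    ("chrX", 9000, 22000),
    ("chrX", 24000, 34000),
    ("chrX", 36000, 38000),
]

if __name__ == "__main__":
    for window in [("chrX", 1, 10001), ("chrX", 20001, 30001), ("chrX", 30001, 40001)]:
        print(window, feature_coverage(window, exons))
```

For the first window, the two overlapping exons merge into 5000-7000 (2000 nucleotides) and the third exon contributes 9000-10001 (1001 nucleotides), which gives the 3001 nucleotides shown in column (a).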