/tools/peakcalling/MMChIP-seq.xml


<tool name="MMChIP-seq" id="peakcalling_MMChIP-seq">
  <description>can combine ChIP-seq libraries from different batches with different fragment sizes.</description>
  <command interpreter="command">/bin/bash $shscript</command>
  <inputs>
    <conditional name="genome_size_cond">
      <param name="genome_size" type="select" label="mappable genome size">
        <option value="2770000000">Human (hg18)</option>
        <option value="2790000000">Human (hg19)</option>
        <option value="1870000000">Mouse (mm8)</option>
        <option value="1910000000">Mouse (mm9)</option>
        <option value="90300000">C elegans (ce4)</option>
        <option value="90300000">C elegans (ce6)</option>
        <option value="119000000">Drosophila (dm2)</option>
        <option value="152000000">Drosophila (dm3)</option>
        <option value="OTHER">Other</option>
      </param>
      <when value="OTHER">
        <param name="genome_size_other" type="text" label="Custom Genome Size"/>
      </when>
      <when value="2770000000"/>
      <when value="2790000000"/>
      <when value="1870000000"/>
      <when value="1910000000"/>
      <when value="90300000"/>
      <when value="119000000"/>
      <when value="152000000"/>
    </conditional>
    <param name="pvalue" type="float" label="p-value cutoff for peak detection" value="0.00001">
      <validator type="in_range" max="1" min="0" message="The p-value is out of range; it must be between 0 and 1." />
    </param>
    <repeat name="more" title="more bed file">
      <param format="bed" name="tfile" type="data" label="select treat file"/>
      <param format="bed" name="cfile" type="data" label="select control file" />
      <param name="frag" type="integer" label="fragment size" value="0" />
    </repeat>
  </inputs>
  <outputs>
    <data format="bed" name="peakfile" label="MMChIP-seq peaks output" />
    <data format="txt" name="xlsfile" label="MMChIP-seq xls output" />
    <data format="wig" name="wigfile" label="MMChIP-seq wig output" />
    <data format="txt" name="log" label="MMChIP-seq log" />
  </outputs>
  <configfiles>
    <configfile name="shscript">
#!/bin/bash
#import os
## chr(38), chr(62) and chr(36) are the ampersand, greater-than and dollar characters,
## which would otherwise clash with XML or Cheetah syntax
#set $ad = chr(38)
#set $gt = chr(62)
#set $dollar = chr(36)
## use the custom genome size when "Other" is selected
#set $genomeSize = $genome_size_cond.genome_size
#if str($genome_size_cond.genome_size) == "OTHER"
#set $genomeSize = $genome_size_cond.genome_size_other
#end if
## the tool path is set before the loop because it is also needed after it
#set $path = os.path.abspath($__app__.config.tool_path)
#set $files = ""
#for $m in $more
## validate the treatment file, then the control file; stop with an error status on failure
format=`$path/validation/fcfunc.py $m.tfile`
if [[ ${dollar}format != "passed" ]]; then
echo ${dollar}format ${gt}${ad}2
exit 1
fi
format=`$path/validation/fcfunc.py $m.cfile`
if [[ ${dollar}format != "passed" ]]; then
echo ${dollar}format ${gt}${ad}2
exit 1
fi
## one "treatment,control,fragment size" triple per dataset, separated by spaces
#set $temp = str($m.tfile)+","+str($m.cfile)+","+str($m.frag)
#set $files = $files + " " + str($temp)
#end for
paths=`python $path/peakcalling/GetPath.py $files`
echo ${dollar}paths
MMChIP-seq ${dollar}paths $genomeSize $pvalue ${ad}${gt} $log
mv MMChIP-seq_peaks.bed $peakfile
mv MMChIP-seq_peaks.xls $xlsfile
zcat MMChIP-seq_MACS_wiggle/treat/*.gz ${gt} $wigfile
###mv MMChIP-seq_peaks.wig $wigfile
    </configfile>
  </configfiles>
  <help>
For ChIP-seq data, it is recommended to use MM-ChIP when the inter-library
size difference is comparatively large. Otherwise, one can simply use
MACS on the merged tag data (see more details in our paper, entitled
"MM-ChIP enables integrative analysis of cross-platform and
between-laboratory ChIP-chip or ChIP-seq data", published in Genome
Biology).

**CAUTION:** For ChIP-seq data, MM-ChIP was only tested on Illumina data
because of the lack of public ChIP-seq datasets for the same protein of
interest under similar biological conditions from technical platforms
other than Illumina. For cross-platform data, different statistical models
may be needed to account for inter-platform variation in addition to variation
in inter-library size (see the discussion section of our paper).

This tool is for integrative analysis of ChIP-seq datasets. The original code was
written by Yiwen Chen.

.. class:: warningmark

**NEED IMPROVEMENT**

-----

**Parameters**

- **mappable genome size** Effective genome size. Select one of the listed genomes, or choose *Other* and enter a custom value such as 1.0e+9 or 1000000000.
- **p-value cutoff for peak detection** The p-value cutoff for peak detection. It must be between 0 and 1 (default 0.00001).
- **more bed file** Click *Add new more bed file* to add another set of treatment file, control file and fragment size.
- **select treat file** The mapped read file (BED) for the ChIP sample.
- **select control file** The mapped read file (BED) for the input sample.
  If there is no control sample, the 2nd column can be omitted.
- **fragment size** The fragment size (d) of the individual dataset, as
  estimated by the MACS program.

-----

**Outputs**

- *bed file* This file contains the peak information.
- *xls file* This file contains detailed information about each peak.
- *wig file* This file contains the score for each region.
- *log file* This file contains the messages written by MMChIP-seq during the run.
  </help>
</tool>
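
The argument handling in the `shscript` configfile can be hard to follow in Cheetah syntax, so the following is a minimal Python sketch of the same logic: picking the genome size and joining each repeat entry into a `treatment,control,fragment` triple. The function names `resolve_genome_size` and `build_dataset_args` are hypothetical and do not appear in the tool. The sketch does not reproduce what `GetPath.py` or `MMChIP-seq` do with the resulting string, since that is not shown in this file.

```python
def resolve_genome_size(selected, custom=None):
    """Return the genome size string that would be passed on the command line.

    Mirrors the Cheetah block: the selected option value is used unless it is
    the literal "OTHER", in which case the custom text field is used instead.
    """
    if selected == "OTHER":
        return str(custom)
    return str(selected)


def build_dataset_args(datasets):
    """Join (treatment, control, fragment_size) rows like the Cheetah loop.

    Each row becomes "treatment,control,fragment" and rows are appended with a
    leading space, so a non-empty result starts with a space.
    """
    files = ""
    for tfile, cfile, frag in datasets:
        temp = str(tfile) + "," + str(cfile) + "," + str(frag)
        files = files + " " + temp
    return files


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    rows = [("t1.bed", "c1.bed", 150), ("t2.bed", "c2.bed", 200)]
    print(repr(build_dataset_args(rows)))
    print(resolve_genome_size("OTHER", "1.0e+9"))
```

With the two illustrative rows above, the joined string is `' t1.bed,c1.bed,150 t2.bed,c2.bed,200'` (note the leading space), and an empty repeat list yields an empty string.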