<!-- Source: /tools/evolution/add_scores.xml in https://bitbucket.org/cistrome/cistrome-harvard/ -->
<tool id="hgv_add_scores" name="phyloP" version="1.0.0">
  <description>interspecies conservation scores</description>

  <command interpreter="python">
    add_scores.py "$input1" "$out_file1" "${GALAXY_DATA_INDEX_DIR}/add_scores.loc" "${input1.metadata.dbkey}" "${input1.metadata.chromCol}" "${input1.metadata.startCol}"
  </command>

  <inputs>
    <param format="interval" name="input1" type="data" label="Dataset">
      <validator type="unspecified_build"/>
      <validator type="dataset_metadata_in_file" filename="add_scores.loc" metadata_name="dbkey" metadata_column="0" message="Data is currently not available for the specified build."/>
    </param>
  </inputs>

  <outputs>
    <data format="input" name="out_file1" />
  </outputs>

  <requirements>
    <requirement type="package">add_scores</requirement>
  </requirements>

  <tests>
    <test>
      <param name="input1" value="add_scores_input1.interval" ftype="interval" dbkey="hg18" />
      <output name="out_file1" file="add_scores_output1.interval" />
    </test>
    <test>
      <param name="input1" value="add_scores_input2.bed" ftype="interval" dbkey="hg18" />
      <output name="out_file1" file="add_scores_output2.interval" />
    </test>
  </tests>

  <help>
.. class:: warningmark

This currently works only for builds hg18 and hg19.

-----

**Dataset formats**

The input can be any interval_ format dataset.  The output is also in interval format.
(`Dataset missing?`_)

.. _interval: ${static_path}/formatHelp.html#interval
.. _Dataset missing?: ${static_path}/formatHelp.html

-----

**What it does**

This tool adds a column that measures interspecies conservation at each SNP
position, using conservation scores for primates pre-computed by the
phyloP program.  PhyloP performs an exact P-value computation under a
continuous-time Markov substitution model.

The chromosome and start position are used to look up the scores, so if the
input contains a larger interval, only the score for its first nucleotide is
returned.
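
The lookup described above can be sketched in a few lines of Python.  This is
only an illustration, not the code in add_scores.py: the SCORES table, the
add_score function and its zero-based column arguments are hypothetical, and
the real tool locates the pre-computed phyloP scores through add_scores.loc
rather than an in-memory dictionary.

```python
# Hypothetical score table keyed by (chromosome, start position).
# The two scores are taken from the example output in this help text.
SCORES = {
    ("chr22", 14611956): -2.142,
    ("chr22", 14612076): 0.369,
}


def add_score(line, chrom_col=0, start_col=1):
    """Append the score of the interval's first nucleotide, or NA if unknown."""
    fields = line.split()
    # Only chromosome and start are used; the end column is never consulted.
    key = (fields[chrom_col], int(fields[start_col]))
    score = SCORES.get(key)
    fields.append("NA" if score is None else format(score, ".3f"))
    return "\t".join(fields)


print(add_score("chr22 14611956 14611957 G/T"))  # single-base SNP
print(add_score("chr22 14612076 14612176 A/G"))  # made-up 100-base interval
print(add_score("chr22 14494911 14494912 A/T"))  # position without a score
```

The second call passes a made-up 100-base interval, yet only the score at its
start position is appended; the third shows the NA fallback.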

-----

**Example**

- input file, with SNPs::

    chr22  14440426  14440427  C/T
    chr22  14494851  14494852  A/G
    chr22  14494911  14494912  A/T
    chr22  14550435  14550436  A/G
    chr22  14611956  14611957  G/T
    chr22  14612076  14612077  A/G
    chr22  14668537  14668538  C
    chr22  14668703  14668704  A/T
    chr22  14668775  14668776  G
    chr22  14680074  14680075  A/T
    etc.

- output file, showing conservation scores for primates::

    chr22  14440426  14440427  C/T  0.509
    chr22  14494851  14494852  A/G  0.427
    chr22  14494911  14494912  A/T  NA
    chr22  14550435  14550436  A/G  NA
    chr22  14611956  14611957  G/T  -2.142
    chr22  14612076  14612077  A/G  0.369
    chr22  14668537  14668538  C    0.419
    chr22  14668703  14668704  A/T  -1.462
    chr22  14668775  14668776  G    0.470
    chr22  14680074  14680075  A/T  0.303
    etc.

  "NA" means that the phyloP score was not available.

-----

**Reference**

Siepel A, Pollard KS, Haussler D. (2006)
New methods for detecting lineage-specific selection.
In Proceedings of the 10th International Conference on Research in Computational
Molecular Biology (RECOMB 2006), pp. 190-205.

  </help>
</tool>