/tools/rgenetics/rgCaCo.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<tool id="rgCaCo1" name="Case Control:">
    <description>for unrelated subjects</description>
    <command interpreter="python">
        rgCaCo.py '$i.extra_files_path/$i.metadata.base_name' "$title" '$out_file1' '$logf' '$logf.files_path' '$gffout'
    </command>
    <inputs>
        <param name="i" type="data" label="RGenetics genotype data from your current history"
               format="pbed" />
        <param name='title' type='text' size="132" value='CaseControl' label="Title for this job"/>
    </inputs>

    <outputs>
        <data format="tabular" name="out_file1" label="${title}_rgCaCo.xls" />
        <data format="txt" name="logf" label="${title}_rgCaCo.log"/>
        <data format="gff" name="gffout" label="${title}_rgCaCoTop.gff" />
    </outputs>
    <tests>
        <test>
            <param name='i' value='tinywga' ftype='pbed' >
                <metadata name='base_name' value='tinywga' />
                <composite_data value='tinywga.bim' />
                <composite_data value='tinywga.bed' />
                <composite_data value='tinywga.fam' />
                <edit_attributes type='name' value='tinywga' />
            </param>
            <param name='title' value='rgCaCotest1' />
            <output name='out_file1' file='rgCaCotest1_CaCo.xls' ftype='tabular' compare='diff' />
            <output name='logf' file='rgCaCotest1_CaCo_log.txt' ftype='txt' compare='diff' lines_diff='20' />
            <output name='gffout' file='rgCaCotest1_CaCo_topTable.gff' ftype='gff' compare='diff' />
        </test>
    </tests>
<help>

.. class:: infomark

**Syntax**

- **Genotype file** is the input case control data chosen from the available library of Plink binary files
- **Map file** is the linkage format .map file corresponding to the genotypes in the Genotype file
- **Type of test** is the kind of test statistic to report, such as the Armitage trend test or the genotype test
- **Format** determines how your data will be returned to your Galaxy workspace

-----

**Summary**

This tool performs some standard statistical tests comparing subjects designated as
affected (cases) with unaffected subjects (controls). To avoid bias, it is important that
controls, had they been affected, would have been eligible for sampling as cases. This may seem
odd, but it amounts to requiring that cases and controls are drawn from the same sampling frame.

The Armitage trend test is robust to departure from Hardy-Weinberg equilibrium (HWE) and so is very attractive.
After all, a real disease mutation may well result in distorted HWE, at least in cases. All the other tests
are susceptible to bias in the presence of HWE departures.
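
As an illustration of the trend test described above, here is a minimal sketch in Python 3 of the standard Cochran-Armitage statistic with one degree of freedom. It is not the Plink implementation that this tool calls. The function name ``armitage_trend``, the additive weights 0, 1, 2 and the example counts are assumptions made for the illustration.

```python
import math


def armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Return (chi2, p_value) for the Cochran-Armitage trend test (1 df).

    case_counts and control_counts hold the number of subjects carrying
    0, 1 and 2 copies of the tested allele. The weights are an assumed
    additive coding.
    """
    col_totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n_total = n_cases + n_controls

    # Weighted sums over the three genotype classes.
    sum_t_r = sum(t * r for t, r in zip(weights, case_counts))
    sum_t_n = sum(t * n for t, n in zip(weights, col_totals))
    sum_t2_n = sum(t * t * n for t, n in zip(weights, col_totals))

    numerator = n_total * (n_total * sum_t_r - n_cases * sum_t_n) ** 2
    denominator = n_cases * n_controls * (n_total * sum_t2_n - sum_t_n ** 2)
    if denominator == 0:
        raise ValueError("trend test undefined: no variation in genotype or status")

    chi2 = numerator / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: cases enriched for the tested allele.
chi2, p_value = armitage_trend((10, 20, 30), (30, 20, 10))
print(chi2, p_value)
```

The statistic uses only the genotype counts and the case and control totals. It never estimates allele frequencies under an HWE assumption, which is why the trend test tolerates HWE departures.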

All of these tests are exquisitely sensitive to population stratification that differs between cases
and controls, and this must be tested before believing any results here. Use the PCA method for
100k markers or more.

If you don't see the genotype data set you want here, it can be imported using one of the methods available from
the Galaxy Get Data tool page.

Output format can be UCSC .bed if you want to see your
results as a fully fledged UCSC track. A map file containing the chromosome and offset for each marker is required for
writing this kind of output.
Alternatively, you can use .gg for the UCSC Genome Graphs tool, which has all of the advantages
of the .bed track, plus a neat, visual front end that displays a lot of useful clues.
Either of these is a very useful way of quickly getting a look
at your data in full genomic context.

Finally, if you can't live without
spreadsheet data, choose the .xls tab delimited format. It's not a stupid binary Excel file, just a plain old tab delimited
one with a header. Fortunately Excel is dumb enough to open these without much protest.
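
Because the .xls output is plain tab delimited text with a header row, it can also be read without a spreadsheet. The following is a minimal sketch in Python 3 using the standard ``csv`` module. The helper name ``read_tabular`` and the column names in the sample text are hypothetical, since the real header written by this tool is not listed here.

```python
import csv
import io


def read_tabular(handle):
    """Read tab delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical content standing in for a downloaded result file.
# The column names are made up for this illustration.
sample = "marker\tp_value\nrs0001\t0.5\nrs0002\t0.001\n"
rows = read_tabular(io.StringIO(sample))
for row in rows:
    print(row["marker"], row["p_value"])
```

For a real file, pass an open file handle instead of the in-memory sample, for example ``open(path, newline="")``, where ``path`` is wherever you saved the result.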

-----

.. class:: infomark

**Attribution**

This Galaxy tool relies on Plink (see Plinksrc_) to test Case Control association models.

So, we rely on the author (Shaun Purcell) for the documentation you need specific to those settings - they are very nicely documented - see
DOC_

Tool and Galaxy datatypes originally designed and written for the Rgenetics
series of whole genome scale statistical genetics tools by Ross Lazarus (ross.lazarus@gmail.com)

Copyright Ross Lazarus March 2007

This Galaxy wrapper is released under the LGPL_ but is about as useful as a chocolate teapot without Plink, which is GPL.

I'm no lawyer, but it looks like you get GPL if you use this software. Good luck.

.. _Plinksrc: http://pngu.mgh.harvard.edu/~purcell/plink/

.. _LGPL: http://www.gnu.org/copyleft/lesser.html

.. _DOC: http://pngu.mgh.harvard.edu/~purcell/plink/anal.shtml#cc

</help>
</tool>