/tools/expression/go_analysis.xml

https://bitbucket.org/cistrome/cistrome-harvard/
<?xml version="1.0"?>

<tool name="Conduct GO" id="goId" force_history_refresh="True">
  <description>
     GO analysis of a gene list using Bioconductor (GO, GOstats) and DAVID at NIH
  </description>
  <code file="go_analysis_code.py"/>
  <command interpreter="python">
  go_analysis.py '$title' '$diff_expr_file' '$logmeta' '$diff_expr_file.dbkey' '$annotation'
  </command>

  <inputs>
    <param name="title" label="Title to label the new output file" type="text" size="80" value="Conduct GO" />
    <param name="diff_expr_file" type="data" format="txt" label="Target Gene List"
           optional="false" size="120" help="Choose a target gene list from your history. The file must either contain a column named 'Gene' holding Entrez Gene IDs, or consist of a single column of Entrez Gene IDs."/>
    <param name="annotation" type="select" label="Gene Universe">
      <option value="hgu133a" selected="True">Homo sapiens hgu133a</option>
      <option value="hgu133b">Homo sapiens hgu133b</option>
      <option value="hgu133plus2">Homo sapiens hgu133plus2</option>
      <option value="hgu95av2">Homo sapiens hgu95av2</option>
      <option value="mouse430a2">Mouse 430a2</option>
      <option value="celegans">C. elegans</option>
      <!--<option value="fly.db0">Fly</option>-->
      <option value="drosophila2">Drosophila</option>
      <option value="org.Hs.eg">org.Hs.eg</option>
      <option value="org.Mm.eg">org.Mm.eg</option>
      <option value="org.Ce.eg">org.Ce.eg</option>
      <option value="org.Dm.eg">org.Dm.eg</option>
    </param>
  </inputs>

  <outputs>
    <data format='txt' name="logmeta"/>
  </outputs>
  <help>

**Syntax**

- **Title:** Used to name the output files, so make it meaningful
- **Target Gene List:** Choose a target gene list from your history
- **Gene Universe:** Select a gene universe

-----

**Summary**

For a list of input genes, this tool uses the R/Bioconductor packages GO and GOstats to
identify over-represented GO terms. The number of input genes associated
with a GO term is compared to the number of genes in the gene universe that are
associated with the same GO term. The gene universe should be the
list of genes from which the differentially expressed genes (the input genes) were identified.
This gene universe can be either the collection of all genes that can be detected with
the microarray used in the analysis, or the list of genes that passed a non-specific
pre-filtering step in an analysis for the identification of differentially expressed genes.
This tool can also perform GO analysis using DAVID (http://david.abcc.ncifcrf.gov).

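
The comparison described above can be sketched as a one-sided hypergeometric test. The following is a minimal Python sketch (Python 3.8 or later, for ``math.comb``), assuming a plain, unconditional hypergeometric model. The function name ``overrepresentation_pvalue`` and its parameters are hypothetical; they are not part of this tool or of GOstats, whose actual test may differ (for example, by conditioning on the GO graph structure).

```python
from math import comb


def overrepresentation_pvalue(universe_size, universe_hits, sample_size, sample_hits):
    """Upper-tail hypergeometric probability P(X >= sample_hits).

    universe_size: number of genes in the gene universe
    universe_hits: universe genes annotated with the GO term
    sample_size:   number of input (target) genes
    sample_hits:   input genes annotated with the GO term
    (All names are illustrative assumptions, not the tool's own.)
    """
    total = comb(universe_size, sample_size)
    # The term cannot be seen more often than it occurs in the universe
    # or than there are input genes.
    upper = min(universe_hits, sample_size)
    tail = sum(
        comb(universe_hits, i) * comb(universe_size - universe_hits, sample_size - i)
        for i in range(sample_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes in the universe, 4 carry the term;
    # 3 input genes, 2 of which carry the term.
    print(overrepresentation_pvalue(10, 4, 3, 2))
```

With these toy numbers the tail sum is (6*6 + 4) / 120, that is, one third. A small p-value would indicate that the term occurs among the input genes more often than expected from the gene universe alone.
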
The input list of target genes (Entrez Gene IDs) is typically obtained from the
"Calculate differential expression" tool and has the following format:

::

    Probe       Symbol  Description     Gene    Cytoband    Log2Ratio   PValue
    20042_at    SOD1    superoxide..    6647    21q22.1     0.838191    0.008021
    200818_at   ATP5O   ATP synthase..  539     21q22.1-q   0.711812    0.006348
    201123_s_at EIF5A   eukaryotic..    1984    17p13-p12   -1.80077    0.008021

Any gene list with Entrez IDs can be used as input for this tool. You can load the list into your history using the
"Upload File from your computer" tool; the tool will look for a column called "Gene". The following is a valid example:

::

    Gene
    351
    6647
    3337
    754
    6612
    539
    1984
    1471
    5445
    8209
    522

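
The two accepted input layouts can be illustrated with a short Python sketch. It assumes the file is tab-delimited, which the text above does not state, and the helper name ``read_entrez_ids`` is hypothetical; this is not the code in go_analysis.py, only an illustration of the "Gene" column lookup with a single-column fallback.

```python
import csv
import io


def read_entrez_ids(handle):
    """Return Entrez IDs from the 'Gene' column, or from the only column.

    Assumes tab-delimited text; this helper is an illustrative sketch.
    """
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return []
    header = [cell.strip() for cell in rows[0]]
    if "Gene" in header:
        col = header.index("Gene")
        body = rows[1:]
    else:
        # No header named 'Gene': treat the file as a bare single-column list.
        col = 0
        body = rows
    # Skip short rows and empty cells instead of failing on them.
    return [row[col].strip() for row in body if len(row) > col and row[col].strip()]


if __name__ == "__main__":
    table = io.StringIO("Probe\tSymbol\tGene\n20042_at\tSOD1\t6647\n201123_s_at\tEIF5A\t1984\n")
    print(read_entrez_ids(table))
    single_column = io.StringIO("351\n6647\n3337\n")
    print(read_entrez_ids(single_column))
```

IDs are returned as strings so that they can be passed on unchanged to whichever annotation lookup is used.
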
The output will be 3 different text files:

- Cellular Component ontology: GO_CC_Result.txt
- Biological Process ontology: GO_BP_Result.txt
- Molecular Function ontology: GO_MF_Result.txt

The Entrez IDs in the "Gene" column are mapped to GO terms to find the significantly
over-represented terms in each GO analysis. Reported GO terms are sorted by
significance (p-value).

After the GO analysis is conducted using the R/Bioconductor packages (GO, GOstats), the original
target gene list is also sent to DAVID (http://david.abcc.ncifcrf.gov) for comparative
analysis.

  </help>
</tool>