/static/formatHelp.html

  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  2. "http://www.w3.org/TR/html4/loose.dtd">
  3. <html>
  4. <head>
  5. <title>Galaxy Data Formats</title>
  6. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  7. <meta http-equiv="Content-Style-Type" content="text/css">
  8. <style type="text/css">
  9. hr { margin-top: 3ex; margin-bottom: 1ex; border: 1px inset }
  10. </style>
  11. </head>
  12. <body>
  13. <h2>Galaxy Data Formats</h2>
  14. <p>
  15. <br>
  16. <h3>Dataset missing?</h3>
  17. <p>
  18. If you have a dataset in your history that is not appearing in the
  19. drop-down selector for a tool, the most common reason is that it has
  20. the wrong format. Each Galaxy dataset has an associated file format
  21. recorded in its metadata, and tools will only list datasets from your
  22. history that have a format compatible with that particular tool. Of
  23. course some of these datasets might not actually contain relevant
  24. data, or even the correct columns needed by the tool, but filtering
  25. by format at least makes the list to select from a bit shorter.
  26. <p>
  27. Some of the formats are defined hierarchically, going from very
  28. general ones like <a href="#tab">Tabular</a> (which includes any text
  29. file with tab-separated columns), to more restrictive sub-formats
  30. like <a href="#interval">Interval</a> (where three of the columns
  31. must be the chromosome, start position, and end position), and on
  32. to even more specific ones such as <a href="#bed">BED</a> that have
  33. additional requirements. So, for example, if a tool's required input
  34. format is Tabular, then all of your history items whose format is
  35. recorded as Tabular will be listed, along with those in all
  36. sub-formats that also qualify as Tabular (Interval, BED, GFF, etc.).
  37. <p>
  38. There are two usual methods for changing a dataset's format in
  39. Galaxy: if the file contents are already in the required format but
  40. the metadata is wrong (perhaps because the Auto-detect feature of the
  41. Upload File tool guessed it incorrectly), you can fix the metadata
  42. manually by clicking on the pencil icon beside that dataset in your
  43. history. Or, if the file contents really are in a different format,
  44. Galaxy provides a number of format conversion tools (e.g. in the
  45. Text Manipulation and Convert Formats categories). For instance,
  46. if the tool you want to run requires Tabular but your columns are
  47. delimited by spaces or commas, you can use the "Convert delimiters
  48. to TAB" tool under Text Manipulation to reformat your data. However,
  49. if your files are in a completely unsupported format, then you need
  50. to convert them yourself before uploading.
  51. <p>
  52. <hr>
  53. <h3>Format Descriptions</h3>
  54. <ul>
  55. <li><a href="#ab1">AB1</a>
  56. <li><a href="#axt">AXT</a>
  57. <li><a href="#bam">BAM</a>
  58. <li><a href="#bed">BED</a>
  59. <li><a href="#bedgraph">BedGraph</a>
  60. <li><a href="#binseq">Binseq.zip</a>
  61. <li><a href="#fasta">FASTA</a>
  62. <li><a href="#fastqsolexa">FastqSolexa</a>
  63. <li><a href="#fped">FPED</a>
  64. <li><a href="#gd_indivs">gd_indivs</a>
  65. <li><a href="#gd_ped">gd_ped</a>
  66. <li><a href="#gd_sap">gd_sap</a>
  67. <li><a href="#gd_snp">gd_snp</a>
  68. <li><a href="#gff">GFF</a>
  69. <li><a href="#gff3">GFF3</a>
  70. <li><a href="#gtf">GTF</a>
  71. <li><a href="#html">HTML</a>
  72. <li><a href="#interval">Interval</a>
  73. <li><a href="#lav">LAV</a>
  74. <li><a href="#lped">LPED</a>
  75. <li><a href="#maf">MAF</a>
  76. <li><a href="#mastervar">MasterVar</a>
  77. <li><a href="#pbed">PBED</a>
  78. <li><a href="#pgSnp">pgSnp</a>
  79. <li><a href="#psl">PSL</a>
  80. <li><a href="#scf">SCF</a>
  81. <li><a href="#sff">SFF</a>
  82. <li><a href="#table">Table</a>
  83. <li><a href="#tab">Tabular</a>
  84. <li><a href="#txtseqzip">Txtseq.zip</a>
  85. <li><a href="#vcf">VCF</a>
  86. <li><a href="#wig">Wiggle custom track</a>
  87. <li><a href="#text">Other text type</a>
  88. </ul>
  89. <p>
  90. <div><a name="ab1"></a></div>
  91. <hr>
  92. <strong>AB1</strong>
  93. <p>
  94. This is one of the ABIF family of binary sequence formats from
  95. Applied Biosystems Inc.
  96. <!-- Their PDF
  97. <a href="http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf"
  98. >format specification</a> is unfortunately password-protected. -->
  99. Files should have a '<code>.ab1</code>' file extension. You must
  100. manually select this file format when uploading the file.
  101. <p>
  102. <div><a name="axt"></a></div>
  103. <hr>
  104. <strong>AXT</strong>
  105. <p>
  106. Used for pairwise alignment output from BLASTZ, after post-processing.
  107. Each alignment block contains three lines: a summary line and two
  108. sequence lines. Blocks are separated from one another by blank lines.
  109. The summary line contains chromosomal position and size information
  110. about the alignment, and consists of nine required fields.
  111. <a href="http://main.genome-browser.bx.psu.edu/goldenPath/help/axt.html"
  112. >More information</a>
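<p>
Illustrative example (all values are made up for this sketch and are not
taken from a real alignment). In each summary line the nine fields are
the alignment number; the chromosome, start, and end in the primary
genome; the chromosome, start, and end in the aligning genome; the
strand of the aligning sequence; and the BLASTZ score:
<pre>
0 chr1 101 110 chr2 501 510 + 900
ACGTACGTAC
ACGTTCGTAC

1 chr1 201 208 chr2 701 708 - 750
GGCATTCA
GGCTTTCA
</pre>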
  113. <!-- (not available on Main)
  114. <dl><dt>Can be converted to:
  115. <dd><ul>
  116. <li>FASTA<br>
  117. Convert Formats &rarr; AXT to FASTA
  118. <li>LAV<br>
  119. Convert Formats &rarr; AXT to LAV
  120. </ul></dl>
  121. -->
  122. <p>
  123. <div><a name="bam"></a></div>
  124. <hr>
  125. <strong>BAM</strong>
  126. <p>
  127. A binary alignment file compressed in the BGZF format with a
  128. '<code>.bam</code>' file extension.
  129. <!-- You must manually select this file format when uploading the file. -->
  130. <a href="http://samtools.sourceforge.net/SAM1.pdf">SAM</a>
  131. is the human-readable text version of this format.
  132. <dl><dt>Can be converted to:
  133. <dd><ul>
  134. <li>SAM<br>
  135. NGS: SAM Tools &rarr; BAM-to-SAM
  136. <li>Pileup<br>
  137. NGS: SAM Tools &rarr; Generate pileup
  138. <li>Interval<br>
  139. First convert to Pileup as above, then use
  140. NGS: SAM Tools &rarr; Pileup-to-Interval
  141. </ul></dl>
  142. <p>
  143. <div><a name="bed"></a></div>
  144. <hr>
  145. <strong>BED</strong>
  146. <p>
  147. <ul>
  148. <li> also qualifies as Tabular
  149. <li> also qualifies as Interval
  150. </ul>
  151. This tab-separated format describes a genomic interval, but has
  152. strict field specifications for use in genome browsers. BED files
  153. can have from 3 to 12 columns, but the order of the columns matters,
  154. and only the end ones can be omitted. Some groups of columns must
  155. be all present or all absent. As in Interval format (but unlike
  156. GFF and its relatives), the interval endpoints use a 0-based,
  157. half-open numbering system.
  158. <a href="http://main.genome-browser.bx.psu.edu/goldenPath/help/hgTracksHelp.html#BED"
  159. >Field specifications</a>
  160. <p>
  161. Example:
  162. <pre>
  163. chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
  164. chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601
  165. </pre>
  166. <dl><dt>Can be converted to:
  167. <dd><ul>
  168. <li>GFF<br>
  169. Convert Formats &rarr; BED-to-GFF
  170. </ul></dl>
  171. <p>
  172. <div><a name="bedgraph"></a></div>
  173. <hr>
  174. <strong>BedGraph</strong>
  175. <p>
  176. <ul>
  177. <li> also qualifies as Tabular
  178. <li> also qualifies as Interval
  179. <li> also qualifies as BED
  180. </ul>
  181. <a href="http://main.genome-browser.bx.psu.edu/goldenPath/help/bedgraph.html"
  182. >BedGraph</a> is a BED file with the name column being a float value
  183. that is displayed as a wiggle score in tracks. Unlike in Wiggle
  184. format, the exact value of this score can be retrieved after being
  185. loaded as a track.
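<p>
Illustrative example (made-up values; columns in a real file are
separated by tabs). The fourth column holds the score:
<pre>
chr1 1000 1100 0.75
chr1 1100 1200 -1.20
chr1 1500 1600 3.50
</pre>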
  186. <p>
  187. <div><a name="binseq"></a></div>
  188. <hr>
  189. <strong>Binseq.zip</strong>
  190. <p>
  191. A zipped archive consisting of binary sequence files in either AB1
  192. or SCF format. All files in this archive must have the same file
  193. extension which is one of '<code>.ab1</code>' or '<code>.scf</code>'.
  194. You must manually select this file format when uploading the file.
  195. <p>
  196. <div><a name="fasta"></a></div>
  197. <hr>
  198. <strong>FASTA</strong>
  199. <p>
  200. A sequence in
  201. <a href="http://www.ncbi.nlm.nih.gov/blast/fasta.shtml">FASTA</a>
  202. format consists of a single-line description, followed by lines of
  203. sequence data. The first character of the description line is a
  204. greater-than ('<code>&gt;</code>') symbol. All lines should be
  205. shorter than 80 characters.
  206. <pre>
  207. >sequence1
  208. atgcgtttgcgtgc
  209. gtcggtttcgttgc
  210. >sequence2
  211. tttcgtgcgtatag
  212. tggcgcggtga
  213. </pre>
  214. <dl><dt>Can be converted to:
  215. <dd><ul>
  216. <li>Tabular<br>
  217. Convert Formats &rarr; FASTA-to-Tabular
  218. </ul></dl>
  219. <p>
  220. <div><a name="fastqsolexa"></a></div>
  221. <hr>
  222. <strong>FastqSolexa</strong>
  223. <p>
  224. <a href="http://maq.sourceforge.net/fastq.shtml">FastqSolexa</a>
  225. is the Illumina (Solexa) variant of the FASTQ format, which stores
  226. sequences and quality scores in a single file.
  227. <pre>
  228. @seq1
  229. GACAGCTTGGTTTTTAGTGAGTTGTTCCTTTCTTT
  230. +seq1
  231. hhhhhhhhhhhhhhhhhhhhhhhhhhPW@hhhhhh
  232. @seq2
  233. GCAATGACGGCAGCAATAAACTCAACAGGTGCTGG
  234. +seq2
  235. hhhhhhhhhhhhhhYhhahhhhWhAhFhSIJGChO
  236. </pre>
  237. Or
  238. <pre>
  239. @seq1
  240. GAATTGATCAGGACATAGGACAACTGTAGGCACCAT
  241. +seq1
  242. 40 40 40 40 35 40 40 40 25 40 40 26 40 9 33 11 40 35 17 40 40 33 40 7 9 15 3 22 15 30 11 17 9 4 9 4
  243. @seq2
  244. GAGTTCTCGTCGCCTGTAGGCACCATCAATCGTATG
  245. +seq2
  246. 40 15 40 17 6 36 40 40 40 25 40 9 35 33 40 14 14 18 15 17 19 28 31 4 24 18 27 14 15 18 2 8 12 8 11 9
  247. </pre>
  248. <dl><dt>Can be converted to:
  249. <dd><ul>
  250. <li>FASTA<br>
  251. NGS: QC and manipulation &rarr; Generic FASTQ manipulation &rarr; FASTQ to FASTA
  252. <li>Tabular<br>
  253. NGS: QC and manipulation &rarr; Generic FASTQ manipulation &rarr; FASTQ to Tabular
  254. </ul></dl>
  255. <p>
  256. <div><a name="fped"></a></div>
  257. <hr>
  258. <strong>FPED</strong>
  259. <p>
  260. Also known as the FBAT format, for use with the
  261. <a href="http://biosun1.harvard.edu/~fbat/fbat.htm">FBAT</a> program.
  262. It consists of a pedigree file and a phenotype file.
  263. <p>
  264. <div><a name="gd_indivs"></a></div>
  265. <hr>
  266. <strong>gd_indivs</strong>
  267. <p>
  268. This is a tabular file. The first column gives the 1-based column
  269. number in the gd_snp file where the data for the individual or group
  270. starts. The second column is the label from the metadata for the
  271. individual or group. The third column is an alias, or is left
  272. blank.
  273. <p>
  274. <div><a name="gd_sap"></a></div>
  275. <hr>
  276. <strong>gd_sap</strong>
  277. <p>
  278. This is a tabular file describing single amino-acid polymorphisms (SAPs).
  279. You must manually select this file format when uploading the file.
  280. <!--
  281. <a href="http://www.bx.psu.edu/miller_lab/docs/formats/gd_sap_format.html"
  282. >Field specifications</a>
  283. -->
  284. <p>
  285. <div><a name="gd_snp"></a></div>
  286. <hr>
  287. <strong>gd_snp</strong>
  288. <p>
  289. This is a tabular file describing SNPs in individuals or populations.
  290. It contains the zero-based position of the SNP but not the range
  291. required by BED or Interval, so it cannot be used in Genomic Operations
  292. without adding a column for the end position.
  293. You must manually select this file format when uploading the file.
  294. <a href="http://www.bx.psu.edu/miller_lab/docs/formats/gd_snp_format.html"
  295. >Field specifications</a>
  296. <p>
  297. <div><a name="gff"></a></div>
  298. <hr>
  299. <strong>GFF</strong>
  300. <p>
  301. <ul>
  302. <li> also qualifies as Tabular
  303. </ul>
  304. GFF is a tab-separated format somewhat similar to BED, but it has
  305. different columns and is more flexible. There are
  306. <a href="http://main.genome-browser.bx.psu.edu/FAQ/FAQformat#format3"
  307. >nine required fields</a>.
  308. Note that unlike Interval and BED, GFF and its relatives (GFF3, GTF)
  309. use 1-based inclusive coordinates to specify genomic intervals.
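<p>
Illustrative example (hypothetical source and group names; columns in a
real file are separated by tabs). The nine fields are the sequence name,
source, feature, start, end, score, strand, frame, and group. Because
GFF is 1-based and inclusive, the first line covers the same bases as
START=1000, END=1100 in BED or Interval:
<pre>
chr1 exampleSource exon 1001 1100 . + . gene1
chr1 exampleSource exon 2001 2150 . - . gene2
</pre>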
  310. <dl><dt>Can be converted to:
  311. <dd><ul>
  312. <li>BED<br>
  313. Convert Formats &rarr; GFF-to-BED
  314. </ul></dl>
  315. <p>
  316. <div><a name="gff3"></a></div>
  317. <hr>
  318. <strong>GFF3</strong>
  319. <p>
  320. <ul>
  321. <li> also qualifies as Tabular
  322. </ul>
  323. The <a href="http://www.sequenceontology.org/gff3.shtml">GFF3</a>
  324. format addresses the most common extensions to GFF, while attempting
  325. to preserve compatibility with previous formats.
  326. Note that unlike Interval and BED, GFF and its relatives (GFF3, GTF)
  327. use 1-based inclusive coordinates to specify genomic intervals.
  328. <p>
  329. <div><a name="gtf"></a></div>
  330. <hr>
  331. <strong>GTF</strong>
  332. <p>
  333. <ul>
  334. <li> also qualifies as Tabular
  335. </ul>
  336. <a href="http://main.genome-browser.bx.psu.edu/FAQ/FAQformat#format4"
  337. >GTF</a> is a format for describing genes and other features associated
  338. with DNA, RNA, and protein sequences. It is a refinement to GFF that
  339. tightens the specification.
  340. Note that unlike Interval and BED, GFF and its relatives (GFF3, GTF)
  341. use 1-based inclusive coordinates to specify genomic intervals.
  342. <!-- (not available on Main)
  343. <dl><dt>Can be converted to:
  344. <dd><ul>
  345. <li>BedGraph<br>
  346. Convert Formats &rarr; GTF-to-BEDGraph
  347. </ul></dl>
  348. -->
  349. <p>
  350. <div><a name="html"></a></div>
  351. <hr>
  352. <strong>HTML</strong>
  353. <p>
  354. This format is an HTML web page. Click the eye icon next to the
  355. dataset to view it in your browser.
  356. <p>
  357. <div><a name="interval"></a></div>
  358. <hr>
  359. <strong>Interval</strong>
  360. <p>
  361. <ul>
  362. <li> also qualifies as Tabular
  363. </ul>
  364. This Galaxy format represents genomic intervals. It is tab-separated,
  365. but has the added requirement that three of the columns must be the
  366. chromosome name, start position, and end position, where the positions
  367. use a 0-based, half-open numbering system (see below). An optional
  368. strand column can also be specified, and an initial header row can
  369. be used to label the columns, which do not have to be in any special
  370. order. Arbitrary additional columns can also be present.
  371. <p>
  372. Required fields:
  373. <ul>
  374. <li>CHROM - The name of the chromosome (e.g. chr3, chrY, chr2_random)
  375. or contig (e.g. ctgY1).
  376. <li>START - The starting position of the feature in the chromosome or
  377. contig. The first base in a chromosome is numbered 0.
  378. <li>END - The ending position of the feature in the chromosome or
  379. contig. This base is not included in the feature. For example,
  380. the first 100 bases of a chromosome are described as START=0,
  381. END=100, and span the bases numbered 0-99.
  382. </ul>
  383. Optional:
  384. <ul>
  385. <li>STRAND - Defines the strand, either '<code>+</code>' or
  386. '<code>-</code>'.
  387. <li>Header row
  388. </ul>
  389. Example:
  390. <pre>
  391. #CHROM START END STRAND NAME COMMENT
  392. chr1 10 100 + exon myExon
  393. chrX 1000 10050 - gene myGene
  394. </pre>
  395. <dl><dt>Can be converted to:
  396. <dd><ul>
  397. <li>BED<br>
  398. The exact changes needed and tools to run will vary with what fields
  399. are in the Interval file and what type of BED you are converting to.
  400. In general you will likely use Text Manipulation &rarr; Compute, Cut,
  401. or Merge Columns.
  402. </ul></dl>
  403. <p>
  404. <div><a name="lav"></a></div>
  405. <hr>
  406. <strong>LAV</strong>
  407. <p>
  408. <a href="http://www.bx.psu.edu/miller_lab/dist/lav_format.html">LAV</a>
  409. is the raw pairwise alignment format that is output by BLASTZ. The
  410. first line begins with <code>#:lav</code>.
  411. <!-- (not available on Main)
  412. <dl><dt>Can be converted to:
  413. <dd><ul>
  414. <li>BED<br>
  415. Convert Formats &rarr; LAV to BED
  416. </ul></dl>
  417. -->
  418. <p>
  419. <div><a name="lped"></a></div>
  420. <hr>
  421. <strong>LPED</strong>
  422. <p>
  423. This is the linkage pedigree format, which consists of separate MAP and PED
  424. files. Together these files describe SNPs; the map file contains the position
  425. and an identifier for the SNP, while the pedigree file has the alleles. To
  426. upload this format into Galaxy, do not use Auto-detect for the file format;
  427. instead select <code>lped</code>. You will then be given two sections for
  428. uploading files, one for the pedigree file and one for the map file. For more
  429. information, see
  430. <a href="http://www.broadinstitute.org/science/programs/medical-and-population-genetics/haploview/input-file-formats-0"
  431. >linkage pedigree</a>,
  432. <a href="http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#map">MAP</a>,
  433. and/or <a href="http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped">PED</a>.
  434. <dl><dt>Can be converted to:
  435. <dd><ul>
  436. <li>PBED<br>Automatic
  437. <li>FPED<br>Automatic
  438. </ul></dl>
  439. <p>
  440. <div><a name="maf"></a></div>
  441. <hr>
  442. <strong>MAF</strong>
  443. <p>
  444. <a href="http://main.genome-browser.bx.psu.edu/FAQ/FAQformat#format5"
  445. >MAF</a> is the multi-sequence alignment format that is output by TBA
  446. and Multiz. The first line begins with '<code>##maf</code>'. This
  447. word is followed by whitespace-separated "variable<code>=</code>value"
  448. pairs. There should be no whitespace surrounding the '<code>=</code>'.
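<p>
Illustrative example (species names, sizes, and score are made up for
this sketch). After the header, each alignment block starts with an
'<code>a</code>' line and has one '<code>s</code>' line per sequence,
giving the source name, start, size, strand, source length, and the
aligned text. A blank line ends the block:
<pre>
##maf version=1
a score=100.0
s speciesA.chr1 100 10 + 1000 ACGTACGTAC
s speciesB.chr4 200 10 + 2000 ACGTTCGTAC

</pre>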
  449. <dl><dt>Can be converted to:
  450. <dd><ul>
  451. <li>BED<br>
  452. Convert Formats &rarr; MAF to BED
  453. <li>Interval<br>
  454. Convert Formats &rarr; MAF to Interval
  455. <li>FASTA<br>
  456. Convert Formats &rarr; MAF to FASTA
  457. </ul></dl>
  458. <p>
  459. <div><a name="mastervar"></a></div>
  460. <hr>
  461. <strong>MasterVar</strong>
  462. <p>
  463. MasterVar is a tab-delimited text format with specified fields developed
  464. by the Complete Genomics life sciences company.
  465. <a href="http://media.completegenomics.com/documents/DataFileFormats_Standard_Pipeline_2.2.pdf"
  466. >Field specifications</a>.
  467. <dl><dt>Can be converted to:
  468. <dd><ul>
  469. <li>pgSnp<br>
  470. Convert Formats &rarr; MasterVar to pgSnp
  471. <li>gd_snp<br>
  472. Convert Formats &rarr; MasterVar to gd_snp
  473. </ul></dl>
  474. <p>
  475. <div><a name="pbed"></a></div>
  476. <hr>
  477. <strong>PBED</strong>
  478. <p>
  479. This is the binary version of the LPED format.
  480. <dl><dt>Can be converted to:
  481. <dd><ul>
  482. <li>LPED<br>Automatic
  483. </ul></dl>
  484. <p>
  485. <div><a name="pgSnp"></a></div>
  486. <hr>
  487. <strong>pgSnp</strong>
  488. <p>
  489. This is the personal genome SNP format used by UCSC. It is a BED-like
  490. format with columns chosen for the specialized display in the browser
  491. for personal genomes.
  492. <a href="http://genome.ucsc.edu/FAQ/FAQformat.html#format10"
  493. >Field specifications</a>.
  494. Galaxy treats it the same as an interval file.
  495. <p>
  496. <div><a name="psl"></a></div>
  497. <hr>
  498. <strong>PSL</strong>
  499. <p>
  500. <a href="http://main.genome-browser.bx.psu.edu/FAQ/FAQformat#format2">PSL</a>
  501. format is used for alignments returned by
  502. <a href="http://genome.ucsc.edu/cgi-bin/hgBlat?command=start">BLAT</a>.
  503. It does not include any sequence.
  504. <p>
  505. <div><a name="scf"></a></div>
  506. <hr>
  507. <strong>SCF</strong>
  508. <p>
  509. This is a binary sequence format originally designed for the Staden
  510. sequence handling software package. Files should have a
  511. '<code>.scf</code>' file extension. You must manually select this
  512. file format when uploading the file.
  513. <a href="http://staden.sourceforge.net/manual/formats_unix_2.html"
  514. >More information</a>
  515. <p>
  516. <div><a name="sff"></a></div>
  517. <hr>
  518. <strong>SFF</strong>
  519. <p>
  520. This is a binary sequence format used by the Roche 454 GS FLX
  521. sequencing machine, and is documented on p.&nbsp;528 of their
  522. <a href="http://sequence.otago.ac.nz/download/GS_FLX_Software_Manual.pdf"
  523. >software manual</a>. Files should have a '<code>.sff</code>' file
  524. extension.
  525. <!-- You must manually select this file format when uploading the file. -->
  526. <dl><dt>Can be converted to:
  527. <dd><ul>
  528. <li>FASTA<br>
  529. Convert Formats &rarr; SFF converter
  530. <li>FASTQ<br>
  531. Convert Formats &rarr; SFF converter
  532. </ul></dl>
  533. <p>
  534. <div><a name="table"></a></div>
  535. <hr>
  536. <strong>Table</strong>
  537. <p>
  538. Text data separated into columns by something other than tabs.
  539. <p>
  540. <div><a name="tab"></a></div>
  541. <hr>
  542. <strong>Tabular (tab-delimited)</strong>
  543. <p>
  544. One or more columns of text data separated by tabs.
  545. <dl><dt>Can be converted to:
  546. <dd><ul>
  547. <li>FASTA<br>
  548. Convert Formats &rarr; Tabular-to-FASTA<br>
  549. The Tabular file must have a title and sequence column.
  550. <li>FASTQ<br>
  551. NGS: QC and manipulation &rarr; Generic FASTQ manipulation &rarr; Tabular to FASTQ
  552. <li>Interval<br>
  553. If the Tabular file has a chromosome column (or is all on one
  554. chromosome) and has a position column, you can create an Interval
  555. file (e.g. for SNPs). If it is all on one chromosome, use
  556. Text Manipulation &rarr; Add column to add a CHROM column.
  557. If the given position is 1-based, use
  558. Text Manipulation &rarr; Compute with the position column minus 1 to
  559. get the START, and use the original given column for the END.
  560. If the given position is 0-based, use it as the START, and compute
  561. that plus 1 to get the END.
  562. </ul></dl>
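<p>
Worked example of the Tabular-to-Interval arithmetic described above,
for a hypothetical SNP on chr1 (values are illustrative only). Both
rows describe the same single base:
<pre>
given position      CHROM  START  END
1000 (1-based)      chr1   999    1000
999  (0-based)      chr1   999    1000
</pre>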
  563. <p>
  564. <div><a name="txtseqzip"></a></div>
  565. <hr>
  566. <strong>Txtseq.zip</strong>
  567. <p>
  568. A zipped archive consisting of flat text sequence files. All files
  569. in this archive must have the same file extension of
  570. '<code>.txt</code>'. You must manually select this file format when
  571. uploading the file.
  572. <p>
  573. <div><a name="vcf"></a></div>
  574. <hr>
  575. <strong>VCF</strong>
  576. <p>
  577. Variant Call Format (VCF) is a tab-delimited text file with specified
  578. fields. It was developed by the 1000 Genomes Project.
  579. <a href="http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41"
  580. >Field specifications</a>.
  581. <dl><dt>Can be converted to:
  582. <dd><ul>
  583. <li>pgSnp<br>
  584. Convert Formats &rarr; VCF to pgSnp
  585. </ul></dl>
  586. <p>
  587. <div><a name="wig"></a></div>
  588. <hr>
  589. <strong>Wiggle custom track</strong>
  590. <p>
  591. Wiggle tracks are typically used to display per-nucleotide scores
  592. in a genome browser. The Wiggle format for custom tracks is
  593. line-oriented, and the wiggle data is preceded by a track definition
  594. line that specifies which of three different types is being used.
  595. <a href="http://main.genome-browser.bx.psu.edu/goldenPath/help/wiggle.html"
  596. >More information</a>
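<p>
Illustrative sketch of one of the types (track name and scores are made
up; see the linked documentation for the authoritative syntax). A track
definition line is followed by a <code>fixedStep</code> block that
gives one score per 100-base window:
<pre>
track type=wiggle_0 name="example track"
fixedStep chrom=chr1 start=1001 step=100 span=100
0.5
1.5
0.9
</pre>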
  597. <dl><dt>Can be converted to:
  598. <dd><ul>
  599. <li>Interval<br>
  600. Get Genomic Scores &rarr; Wiggle-to-Interval
  601. <li>BED<br>
  602. First convert to Interval as above, then remove the extra columns using
  603. Text Manipulation &rarr; Cut columns from a table to get 3- or 4-column BED.
  604. </ul></dl>
  605. <p>
  606. <div><a name="gd_ped"></a></div>
  607. <hr>
  608. <strong>gd_ped</strong>
  609. <p>
  610. Similar to the linkage pedigree format (lped).
  611. <p>
  612. <div><a name="text"></a></div>
  613. <hr>
  614. <strong>Other text type</strong>
  615. <p>
  616. Any text file.
  617. <dl><dt>Can be converted to:
  618. <dd><ul>
  619. <li>Tabular<br>
  620. If the text has fields separated by spaces, commas, or some other
  621. delimiter, it can be converted to Tabular by using
  622. Text Manipulation &rarr; Convert delimiters to TAB.
  623. </ul></dl>
  624. <p>
  625. <!-- blank lines so internal links will jump farther to end -->
  626. <br><br><br><br><br><br><br><br><br><br><br><br>
  627. <br><br><br><br><br><br><br><br><br><br><br><br>
  628. </body>
  629. </html>