fastq_paired_end_splitter.py - This Python script splits a …

/tools/fastq/fastq_paired_end_splitter.py

https://bitbucket.org/cistrome/cistrome-harvard/ · Python · 33 lines · 27 code · 4 blank · 2 comment · 8 complexity · 64c7e8b5d571e91565b8701af9e9e281 MD5


#Dan Blankenberg
import sys, os, shutil
from galaxy_utils.sequence.fastq import fastqReader, fastqWriter, fastqSplitter

def main():
    #Read command line arguments
    input_filename = sys.argv[1]
    input_type = sys.argv[2] or 'sanger'
    output1_filename = sys.argv[3]
    output2_filename = sys.argv[4]
    
    splitter = fastqSplitter()
    out1 = fastqWriter( open( output1_filename, 'wb' ), format = input_type )
    out2 = fastqWriter( open( output2_filename, 'wb' ), format = input_type )
    
    i = None
    skip_count = 0
    for i, fastq_read in enumerate( fastqReader( open( input_filename, 'rb' ), format = input_type ) ):
        read1, read2 = splitter.split( fastq_read )
        if read1 and read2:
            out1.write( read1 )
            out2.write( read2 )
        else:
            skip_count += 1
    out1.close()
    out2.close()
    if i is None:
        print( "Your file contains no valid FASTQ reads." )
    else:
        print( 'Split %s of %s reads (%.2f%%).' % ( i - skip_count + 1, i + 1, float( i - skip_count + 1 ) / float( i + 1 ) * 100.0 ) )

if __name__ == "__main__":
    main()

Summary ✨

This Python script splits a FASTQ file of joined paired-end reads into two files. For each input read, fastqSplitter.split() returns a pair of reads; when both are present, the first is written to the first output file and the second to the second output file. Otherwise the read is skipped and counted. The command line arguments are, in order: the input file, the FASTQ format (falling back to 'sanger' if empty), and the two output files. The script uses the fastqReader and fastqWriter classes from the galaxy_utils library for input and output. It finishes by printing how many reads were split out of the total, or a message if the file contains no valid FASTQ reads.
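The splitting step can be sketched without galaxy_utils. This is a minimal illustration, not the library's implementation: it assumes a joined read is cut at its midpoint, that reads of odd length cannot be split, and that the mates are tagged "/1" and "/2". The names split_joined_read and split_all are hypothetical.

```python
def split_joined_read(identifier, sequence, quality):
    """Split one joined paired-end record into two mates.

    Returns (mate1, mate2) as (identifier, sequence, quality) tuples, or
    (None, None) when the read cannot be divided evenly.
    Assumption: the split point is the midpoint of the read.
    """
    if len(sequence) != len(quality) or len(sequence) % 2 != 0:
        return None, None
    half = len(sequence) // 2
    # Assumption: mates are distinguished by "/1" and "/2" suffixes.
    mate1 = (identifier + "/1", sequence[:half], quality[:half])
    mate2 = (identifier + "/2", sequence[half:], quality[half:])
    return mate1, mate2


def split_all(records):
    """Split every record; return (mates1, mates2, skipped_count)."""
    mates1, mates2, skipped = [], [], 0
    for identifier, sequence, quality in records:
        mate1, mate2 = split_joined_read(identifier, sequence, quality)
        if mate1 and mate2:
            mates1.append(mate1)
            mates2.append(mate2)
        else:
            # Mirrors the script's skip_count bookkeeping.
            skipped += 1
    return mates1, mates2, skipped
```

The skip counter plays the same role as skip_count in the script: the number of reads split is the total minus the skipped reads.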

Tech Fingerprint

Standard Library: OS Interaction

Alerts (4)

'def': Ensure functions have docstrings for documentation (line 5)
'open(': Use 'with open()' to ensure files are properly closed (lines 13, 14, 18)
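The 'with open()' advice could be applied roughly as follows. This is a sketch under assumptions, not a patch to the script: LineWriter is a hypothetical stand-in for a writer object that only exposes close() (as the script's out1 and out2 do), and contextlib.closing from the standard library is used to close it when the block exits, even if an exception is raised.

```python
from contextlib import closing


class LineWriter:
    """Hypothetical stand-in for a writer object that exposes close()."""

    def __init__(self, handle):
        self.handle = handle

    def write(self, text):
        self.handle.write(text + "\n")

    def close(self):
        self.handle.close()


def split_lines(input_filename, output1_filename, output2_filename):
    """Write the first half of each line to output 1 and the rest to output 2.

    Returns the number of lines processed. All three files are closed
    when the with block exits.
    """
    count = 0
    with open(input_filename) as src, \
            closing(LineWriter(open(output1_filename, "w"))) as out1, \
            closing(LineWriter(open(output2_filename, "w"))) as out2:
        for line in src:
            text = line.rstrip("\n")
            half = len(text) // 2
            out1.write(text[:half])
            out2.write(text[half:])
            count += 1
    return count
```

With this structure the explicit out1.close() and out2.close() calls become unnecessary.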