Dephd updates
Dephd is a small application for performing various analyses of nucleotide sequences. Originally, it was used for analyzing and converting PHD-file output from the basecaller phred, but it has since grown a bit beyond that. A new update was just pushed onto HackageDB, and this is a quick note describing the new features.
Filtering out empty sequences
Phred often produces zero-length sequences, and this confuses other programs. While BLAST will just output a warning, SeqClean (or, to be precise, cln2qual) will break down. My own code using the Bioinformatics library treats all sequences the same regardless of length, so zero-length sequences are perfectly okay there. Anyway, you can now use dephd -z to eliminate them from the output.
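To make the idea concrete, here is a minimal sketch of what dropping zero-length sequences from Fasta input amounts to. It is written in Python purely for illustration and is not dephd's actual code; the function names read_fasta and drop_empty are made up for this example.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from Fasta-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, even an empty one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def drop_empty(records: Iterable[Record]) -> Iterator[Record]:
    """Keep only records whose sequence has at least one base."""
    return ((h, s) for h, s in records if len(s) > 0)


if __name__ == "__main__":
    example = [">read1", "ACGT", ">read2", ">read3", "GG"]
    for header, seq in drop_empty(read_fasta(example)):
        print(">" + header)
        print(seq)
```

In this sketch read2 has a header but no bases, so it is the only record that gets dropped.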
Sequence clipping
Sequence trimming or clipping is often necessary to remove contamination like vector sequence, or simply low-quality parts of the sequence. Typically, both of these occur at the ends of the sequences. Many programs (including dephd, but also phred, lucy, seqclean and others) add trimming information to the sequence header. Dephd is now able to act on this information and clip the sequences. Once a sequence is clipped, the old trimming information is obsolete because the coordinates have changed, so it is replaced with the clipping coordinates. Dephd also provides its own quality assessment, and with the -q option, sequence ends where the sliding window average quality is below 15 will be clipped. This is pretty heavy-handed, but it seems I get better EST clustering with this enabled.
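For readers who want to see the sliding window idea spelled out, here is a rough Python sketch. It is an illustration only, not dephd's implementation: the function name clip_by_quality is invented, the window size of 10 is an assumption (only the threshold of 15 comes from the description above), and the exact rule dephd uses to pick the clip points may well differ.

```python
from typing import List, Tuple


def clip_by_quality(quals: List[int], window: int = 10,
                    threshold: float = 15.0) -> Tuple[int, int]:
    """Return half-open clip coordinates (start, end).

    The kept region runs from the start of the first window whose average
    quality reaches the threshold to the end of the last such window.
    (0, 0) means nothing passed. The window size is an assumed default.
    """
    n = len(quals)
    if n == 0:
        return (0, 0)
    w = min(window, n)  # shrink the window for very short reads

    # Prefix sums let us compute each window average in constant time.
    prefix = [0]
    for q in quals:
        prefix.append(prefix[-1] + q)

    good = [i for i in range(n - w + 1)
            if (prefix[i + w] - prefix[i]) / w >= threshold]
    if not good:
        return (0, 0)
    return (good[0], good[-1] + w)


if __name__ == "__main__":
    seq = "NNNNN" + "ACGTACGTACGTACGTACGT" + "NNNNN"
    quals = [5] * 5 + [30] * 20 + [5] * 5
    start, end = clip_by_quality(quals, window=5)
    print(start, end)
    print(seq[start:end])
```

Note that a window average is a blunt instrument: a few low-quality bases can survive at each end when they share a window with good ones, and a smaller window or a higher threshold clips more aggressively.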
Old features
Of course we retain the old features: reading PHD and Fasta/Qual files, masking by quality (to lower case or N, without clipping), generating quality plots, writing Fasta/Qual output, and ranking sequences by quality.
Edit: In the latest release, there’s now also a fix for a problem with drawing quality graphs with gnuplot. It turns out that my shiny new Ubuntu ships with gnuplot 4.2, but your crappy old distribution ships with an older version, and there are some incompatibilities in the input formats. I’ve now reverted to the old-style format only, so hopefully it should work with gnuplot versions back to 3.7 or so. And for those SLS or MCC interim die-hards out there, I’ll even add an option to dump the gnuplot file itself, so that you can copy it to a floppy and generate the plots on a computer with a modern color display. How’s that for user friendly?