I’m currently involved in a project where we study, among other things, the 3′UTR and poly-A tails of certain genes. For this, it is of course important to accurately identify the poly-A tail in each transcript, but I couldn’t find any program or tool to do just that. Presumably the task is considered too [...]
Posted on December 14th, 2009 by ketil
Filed under: EST analysis, Examples | 2 Comments »
Dephd is a small application for performing various analyses of nucleotide sequences. Originally, it was used for analyzing/converting PHD-file output from the basecaller phred, but it has since grown a bit beyond that. A new update was just pushed onto HackageDB; this is just a quick note describing the new features.
Posted on June 16th, 2009 by ketil
Filed under: EST analysis, Uncategorized | No Comments »
I’ve been spending the last couple of weeks working on an indexing scheme for sequences, using Bryan O’Sullivan’s Bloom filters. Now, it turned out that when Bryan tested out the code, he found a curious problem: Apparently, the indexing stage scaled quadratically with sequence length. This wouldn’t have been so strange, were it not for [...]
Posted on October 24th, 2008 by ketil
Filed under: EST analysis, Examples, Optimization | No Comments »
It was just brought to my attention that people have started to use a new file format for sequences. This format, called ‘FastQ’, combines both the sequence data itself and the quality data in one file. That’s a nice idea, and I implemented support for it, tests, docs and all, in the bio library. Runs [...]
Posted on September 9th, 2008 by ketil
Filed under: EST analysis, Examples | No Comments »
Bloom filters are apparently a relatively old technology, dating from the 1970s or so, but they had somehow escaped my radar until Bryan O’Sullivan posted a message to the Haskell mailing list announcing a high-performance implementation in Haskell, perhaps to support a chapter in the upcoming book. You can read all about Bloom filters on [...]
Posted on July 31st, 2008 by ketil
Filed under: EST analysis, Examples, Optimization | 1 Comment »
My current development project is an EST pipeline. For various reasons, it is implemented in shell — bash, to be exact. In other words, the pipeline is a script, or rather a set of scripts, that will tie together the various stages: masking, clustering, assembly, and annotation.
As in any program, there are many occasions where [...]
Posted on July 11th, 2008 by ketil
Filed under: EST analysis, Examples | 1 Comment »
The first challenge when dealing with sequence data is removing vector sequence, contaminants, and other undesirable stuff. I’ve been somewhat unhappy with the current state of my EST pipeline, so I investigated more closely what is going on.
Posted on May 8th, 2008 by ketil
Filed under: EST analysis | No Comments »