I recently did a brief presentation of the set of tools I’ve developed for analyzing pyrosequences (the Roche 454 variety). Nothing spectacular, just an overview of various ways of slicing and dicing SFF files using tools written in Haskell. For lack of a better place to put it, I’ll drop my slides below.
flowers
Posted on February 19th, 2010 by ketil
Filed under: Downloads, Examples | No Comments »
I’m currently involved in a project where we study, among other things, the 3′UTR and poly-A tails of certain genes. For this, it is of course important to accurately identify the poly-A tail in each transcript, but I couldn’t find any program or tool to do just that. Presumably the task is considered too [...]
Posted on December 14th, 2009 by ketil
Filed under: EST analysis, Examples | 2 Comments »
Ubuntu 9.10, nicknamed Karmic Koala, is about to be released, and in a moment of idleness, I upgraded my old 9.04 install to the latest beta. Upgrading always generates a slight feeling of dread, taking the plunge from the cozy stability of bugs I’ve learned to work around, into the great unknown, but it [...]
Posted on October 7th, 2009 by ketil
Filed under: Examples | No Comments »
I was recently in Trondheim, and got an opportunity to present Haskell to an audience of bioinformaticians. Alas, it is hard to describe Haskell in all its glory to the uninitiated in forty-five minutes, and especially when I also wanted to talk a bit about the application to bioinformatics. I left in the belief that [...]
Posted on September 15th, 2009 by ketil
Filed under: Downloads, Examples | No Comments »
A recurring theme on the Haskell mailing lists is how to quickly parse a file consisting of integers. Often, this comes up in the context of benchmarking, but a real example of integer-filled files is the quality data that often accompanies Fasta sequence files. When investigating one of my programs that seemed a bit on [...]
Posted on August 31st, 2009 by ketil
Filed under: Examples, Optimization, Uncategorized | No Comments »
Roche’s 454 sequencing technology can produce biological sequence data on a scale that exceeds traditional Sanger sequencing by orders of magnitude. Due to the fundamentally different method used to generate the sequences, we would like to investigate the raw data and see if we can quantify — and maybe also reduce the number or severity [...]
Posted on November 14th, 2008 by ketil
Filed under: Examples, Optimization | 3 Comments »
I’ve been spending the last couple of weeks working on an indexing scheme for sequences, using Bryan O’Sullivan’s Bloom filters. Now, it turned out that when Bryan tested out the code, he found a curious problem: Apparently, the indexing stage scaled quadratically with sequence length. This wouldn’t have been so strange, were it not for [...]
Posted on October 24th, 2008 by ketil
Filed under: EST analysis, Examples, Optimization | No Comments »
It was just brought to my attention that people have started to use a new file format for sequences. This format, called ‘FastQ’, combines both the sequence data itself and the quality data in one file. That’s a nice idea, and I implemented support for it, tests, docs and all, in the bio library. Runs [...]
Posted on September 9th, 2008 by ketil
Filed under: EST analysis, Examples | No Comments »
Bloom filters are apparently a relatively old technology, dating from the 1970s or so, but they somehow escaped my radar until Bryan O’Sullivan posted a message to the Haskell mailing list announcing a high-performance implementation in Haskell, perhaps to support a chapter in the upcoming book. You can read all about Bloom filters on [...]
Posted on July 31st, 2008 by ketil
Filed under: EST analysis, Examples, Optimization | 1 Comment »
My current development project is an EST pipeline. For various reasons, it is implemented in shell — bash, to be exact. In other words, the pipeline is a script, or rather a set of scripts, that will tie together the various stages: masking, clustering, assembly, and annotation.
As in any program, there are many occasions where [...]
Posted on July 11th, 2008 by ketil
Filed under: EST analysis, Examples | 1 Comment »