Wu, J., Anczukow, O., Krainer, A. R., Zhang, M. Q., Zhang, C. (2013) OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucleic Acids Research, 41 (10). pp. 5149-5163. ISSN 0305-1048
Abstract
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (approximately 14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size. The Burrows-Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed. OLego also identified hundreds of novel micro-exons (<30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and can be validated experimentally in vivo. OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.
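The abstract mentions mapping small seeds with the Burrows-Wheeler transform. As a rough illustration of that general idea only, the following toy sketch splits a read into fixed-length seeds and finds exact seed matches by backward search over a BWT-based index. It is not OLego's implementation: the function names (`build_index`, `backward_search`, `split_into_seeds`), the naive suffix-array construction, and the tiny example sequence are all assumptions made for illustration, and a real aligner would use a compressed index and handle mismatches, both strands, and junction scoring.

```python
def build_index(text):
    """Build a naive suffix array plus the C and Occ tables of an FM-index.

    Assumption: the text is small enough for O(n^2 log n) suffix sorting;
    production tools use far more efficient constructions.
    """
    s = text + "$"  # sentinel that sorts before A, C, G, T
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    bwt = "".join(s[i - 1] for i in sa)  # character preceding each suffix
    alphabet = sorted(set(s))
    # C[c]: number of characters in s that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += s.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def backward_search(seed, sa, C, occ):
    """Return sorted start positions of exact matches of seed in the text."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(seed):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def split_into_seeds(read, seed_len):
    """Cut a read into non-overlapping seeds as (offset, sequence) pairs."""
    return [(off, read[off:off + seed_len])
            for off in range(0, len(read) - seed_len + 1, seed_len)]


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"  # hypothetical toy reference
    sa, C, occ = build_index(genome)
    for offset, seed in split_into_seeds("ACGTTAGC", 4):
        print(offset, seed, backward_search(seed, sa, C, occ))
```

In a seed-and-extend scheme, seed hits like these would serve as anchors that are then extended and, for spliced reads, connected across candidate introns. The seed length of 4 here is only for the toy sequence; the abstract cites approximately 14 nt for mammalian genomes.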
Item Type: | Paper |
---|---|
Subjects: | bioinformatics; bioinformatics > genomics and proteomics; bioinformatics > genomics and proteomics > annotation > map annotation; bioinformatics > genomics and proteomics > computers > computer software |
Communities: | CSHL Post Doctoral Fellows; CSHL labs > Krainer lab; CSHL Cancer Center Program > Gene Regulation and Cell Proliferation |
Depositing User: | Matt Covey |
Date: | 2013 |
Date Deposited: | 22 May 2013 19:31 |
Last Modified: | 13 Oct 2015 18:40 |
PMCID: | PMC3664805 |
URI: | https://repository.cshl.edu/id/eprint/28311 |