Assembly complexity of prokaryotic genomes using short reads

Kingsford, C., Schatz, M. C., Pop, M. (2010) Assembly complexity of prokaryotic genomes using short reads. BMC Bioinformatics, 11. ISSN 1471-2105

PDF (Paper)
Schatz Genome Biology 2010.pdf - Published Version

URL: http://www.ncbi.nlm.nih.gov/pubmed/20064276
DOI: 10.1186/1471-2105-11-21

Abstract

Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes.

Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper-bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages).

Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed. © 2010 Kingsford et al; licensee BioMed Central Ltd.
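
Illustrative note (not part of the published abstract): the idea of reading repeat structure off a de Bruijn graph can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not the authors' method or software. The function names `de_bruijn_graph` and `branching_nodes`, the toy sequence, and the choice to treat the genome as circular by default are all hypothetical choices made for this sketch. Nodes are (k-1)-mers, each k-mer contributes one edge, and a node with more than one distinct successor marks a point where a repeat makes the path through the graph ambiguous.

```python
from collections import defaultdict


def de_bruijn_graph(genome: str, k: int, circular: bool = True):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    if k < 2 or k > len(genome):
        raise ValueError("k must satisfy 2 <= k <= len(genome)")
    # Assumption: for a circular chromosome, wrap around so that every
    # position in the genome starts a k-mer.
    text = genome + genome[:k - 1] if circular else genome
    graph = defaultdict(set)
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        # Each k-mer is an edge from its prefix to its suffix.
        graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one distinct successor, sorted."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


if __name__ == "__main__":
    toy = "ACGACT"  # hypothetical toy sequence; "AC" occurs twice
    for k in (3, 4):
        print(k, branching_nodes(de_bruijn_graph(toy, k)))
```

In this toy, the repeated 2-mer `AC` is the only branching node at k = 3, and the branching disappears at k = 4 because no 3-mer repeats. This mirrors, in miniature, the abstract's point that repeat-induced ambiguity depends on read length.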

Item Type: Paper
Uncontrolled Keywords: Prokaryota; article; biology; DNA sequence; genetic database; genetics; genome; methodology; prokaryotic cell; prophage; transposon; Computational Biology; Databases, Genetic; DNA Transposable Elements; Prokaryotic Cells; Prophages; Sequence Analysis, DNA
Subjects: bioinformatics > genomics and proteomics > annotation > sequence annotation
bioinformatics > genomics and proteomics > analysis and processing > Sequence Data Processing
bioinformatics > genomics and proteomics > Mapping and Rendering > Sequence Rendering
Communities: CSHL labs > Schatz lab
Depositing User: CSHL Librarian
Date: 2010
Date Deposited: 16 Mar 2012 14:59
Last Modified: 15 Mar 2013 18:58
PMCID: PMC2821320
URI: https://repository.cshl.edu/id/eprint/25366
