Hybrid error correction and de novo assembly of single-molecule sequencing reads

Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J. T., Ganapathy, G., Wang, Z., Rasko, D. A., McCombie, W. R., Jarvis, E. D., Phillippy, A. M. (July 2012) Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nature Biotechnology, 30 (7). pp. 693-700. ISSN 1087-0156

URL: http://www.ncbi.nlm.nih.gov/pubmed/22750884
DOI: 10.1038/nbt.2280

Abstract

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.

Item Type: Paper
Subjects: Investigative techniques and equipment
Investigative techniques and equipment > assays
Investigative techniques and equipment > whole exome sequencing
Investigative techniques and equipment > assays > whole exome sequencing
Communities: CSHL Cancer Center Program > Cancer Genetics
CSHL Cancer Center Shared Resources > DNA Sequencing Service
CSHL labs > McCombie lab
CSHL labs > Schatz lab
Depositing User: Matt Covey
Date: 1 July 2012
Date Deposited: 30 Jan 2013 19:59
Last Modified: 19 Jul 2021 20:43
PMCID: PMC3707490
URI: https://repository.cshl.edu/id/eprint/26981
