Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

Vembar, S. S., Seetin, M., Lambert, C., Nattestad, M., Schatz, M. C., Baybayan, P., Scherf, A., Smith, M. L. (August 2016) Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing. DNA Res, 23 (4). pp. 339-351. ISSN 1756-1663 (Electronic), 1340-2838 (Linking)

URL: http://www.ncbi.nlm.nih.gov/pubmed/27345719
DOI: 10.1093/dnares/dsw022

Abstract

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [approximately 80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g. var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [approximately 90-99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission.
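
The two summary statistics quoted in the abstract, the (A + T) fraction of a sequence and the share of reads falling within a length window, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `at_fraction` and `fraction_in_range` and the toy inputs are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def at_fraction(seq: str) -> float:
    """Return the (A + T) fraction among the unambiguous bases A, C, G, T.

    Hypothetical helper for illustration; ambiguous symbols such as N
    are ignored rather than counted in the denominator.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return at / total


def fraction_in_range(read_lengths: Iterable[int], low: int, high: int) -> float:
    """Return the fraction of reads whose length lies in [low, high] (bases)."""
    lengths = list(read_lengths)
    if not lengths:
        raise ValueError("no read lengths supplied")
    hits = sum(1 for n in lengths if low <= n <= high)
    return hits / len(lengths)


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    print(at_fraction("ATATATATGC"))
    print(fraction_in_range([12000, 16000, 30000, 8000], 15500, 50000))
```

Applied to a real assembly, `at_fraction` would be run per chromosome or per window (for example over a centromere) and `fraction_in_range` over the lengths of all sequenced reads.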

Item Type: Paper
Uncontrolled Keywords: AT-biased Plasmodium falciparum de novo assembly long-read sequencing structural variation
Subjects: diseases & disorders > parasitic diseases
Investigative techniques and equipment > assays > next generation sequencing
Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Schatz lab
School of Biological Sciences > Publications
Depositing User: Matt Covey
Date: August 2016
Date Deposited: 29 Jun 2016 20:17
Last Modified: 12 Oct 2016 14:42
PMCID: PMC4991835
URI: https://repository.cshl.edu/id/eprint/32919
