Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum

Grativol, C., Regulski, M., Bertalan, M., McCombie, W. R., da Silva, F. R., Zerlotini Neto, A., Vicentini, R., Farinelli, L., Hemerly, A. S., Martienssen, R. A., Ferreira, P. C. G. (July 2014) Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum. Plant Journal, 79 (1). pp. 162-172. ISSN 09607412

Abstract

Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Plant genomes contain large tracts of methylated repeats interspersed with hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative for sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methyl filtration and the assembly quality of sequences subjected to the gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. Methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled sequence (in Mb) compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than that by UF scaffolds. Using MF technology, we increased the coverage of gene regions of the monoploid sugarcane genome 134-fold. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes were also identified in the MF scaffolds. The information obtained from the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > de novo assembly
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
Investigative techniques and equipment > assays > next generation sequencing
organism description > plant
Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Martienssen lab
CSHL labs > McCombie lab
Depositing User: Matt Covey
Date: July 2014
Date Deposited: 27 Jun 2014 15:51
Last Modified: 16 Jul 2021 18:37
PMCID: PMC4458261
URI: https://repository.cshl.edu/id/eprint/30348
