Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes

Law, M., Childs, K. L., Campbell, M. S., Stein, J. C., Olson, A. J., Holt, C., Panchy, N., Lei, J., Jiao, D., Andorf, C. M., Lawrence, C. J., Ware, D., Shiu, S. H., Sun, Y., Jiang, N., Yandell, M. (January 2015) Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes. Plant Physiol, 167 (1). pp. 25-39. ISSN 0032-0889

URL: http://www.ncbi.nlm.nih.gov/pubmed/25384563
DOI: 10.1104/pp.114.245027

Abstract

The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes.

Item Type: Paper
Subjects: bioinformatics; organism description > plant > maize; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome annotation; Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Ware lab
Depositing User: Matt Covey
Date: January 2015
Date Deposited: 16 Jan 2015 21:21
Last Modified: 16 Jan 2015 21:21
PMCID: PMC4280997
URI: https://repository.cshl.edu/id/eprint/31134
