From identification to validation to gene count

Amid, C., Frankish, A., Aken, B., Ezkurdia, I., Kokocinski, F., Gilbert, J., White, S., Carninci, P., Gingeras, T. R., Guigo, R., Searle, S., Tress, M. L., Harrow, J., Hubbard, T. (2010) From identification to validation to gene count. Genome Biology, 11 (Suppl ). ISSN 1474-760X

Abstract

The current GENCODE gene count of ~30,000, including 21,727 protein-coding and 8,483 RNA genes, is significantly lower than the 100,000 genes anticipated by early estimates. Accurate annotation of protein-coding and non-coding genes and pseudogenes is essential in calculating the true gene count and gaining insight into human evolution. As part of the GENCODE Consortium, the HAVANA team produces high-quality manual gene annotation, which forms the basis for the reference gene set being used by the ENCODE project and provides a rich annotation of alternative splice variants and assignment of functional potential. However, the protein-coding potential of some splice variants is uncertain, and valid splice variants can remain unannotated if they are absent from current cDNA libraries. Recent technological developments in sequencing and mass spectrometry have created a vast amount of new transcript and protein data that facilitate the identification and validation of new and existing transcripts, while harboring their own limitations and problems.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > annotation
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > transcription
bioinformatics > genomics and proteomics > annotation > gene expression profiling annotation
bioinformatics > genomics and proteomics > annotation > map annotation
bioinformatics > genomics and proteomics > annotation > sequence annotation
bioinformatics > genomics and proteomics > Mapping and Rendering > Sequence Rendering
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > cDNA
Investigative techniques and equipment > spectroscopy > mass spectrometry
Communities: CSHL labs > Gingeras lab
Depositing User: CSHL Librarian
Date: 2010
Date Deposited: 02 Nov 2011 13:49
Last Modified: 12 Jul 2013 18:13
PMCID: PMC3026201
URI: https://repository.cshl.edu/id/eprint/23200