The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression

Derrien, T., Johnson, R., Bussotti, G., Tanzer, A., Djebali, S., Tilgner, H., Guernec, G., Martin, D., Merkel, A., Knowles, D. G., Lagarde, J., Veeravalli, L., Ruan, X. A., Ruan, Y. J., Lassmann, T., Carninci, P., Brown, J. B., Lipovich, L., Gonzalez, J. M., Thomas, M., Davis, C. A., Shiekhattar, R., Gingeras, T. R., Hubbard, T. J., Notredame, C., Harrow, J., Guigo, R. (September 2012) The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Research, 22 (9 Spe). pp. 1775-1789. ISSN 1088-9051

URL: http://www.ncbi.nlm.nih.gov/pubmed/22955988
DOI: 10.1101/gr.132159.111

Abstract

The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to those of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally expressed at lower levels than protein-coding genes and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.

Item Type: Paper
Uncontrolled Keywords: messenger-rna human genome chromatin identification transcription annotation elements database reveals product
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function > gene expression
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function
Communities: CSHL labs > Gingeras lab
Depositing User: Matt Covey
Date: September 2012
Date Deposited: 31 Jan 2013 20:35
Last Modified: 06 Apr 2015 19:07
PMCID: PMC3431493
URI: https://repository.cshl.edu/id/eprint/26927
