The status of the human gene catalogue

Amaral, Paulo, Carbonell-Sala, Silvia, De La Vega, Francisco M, Faial, Tiago, Frankish, Adam, Gingeras, Thomas, Guigo, Roderic, Harrow, Jennifer L, Hatzigeorgiou, Artemis G, Johnson, Rory, Murphy, Terence D, Pertea, Mihaela, Pruitt, Kim D, Pujar, Shashikant, Takahashi, Hazuki, Ulitsky, Igor, Varabyou, Ales, Wells, Christine A, Yandell, Mark, Carninci, Piero, Salzberg, Steven L (October 2023) The status of the human gene catalogue. Nature, 622 (7981). pp. 41-47. ISSN 0028-0836

URL: https://www.ncbi.nlm.nih.gov/pubmed/37794265
DOI: 10.1038/s41586-023-06490-x

Abstract

Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Besides the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes for the use of the human gene catalogue in clinical settings.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > annotation
bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > annotation > sequence annotation
organism description > animal
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome annotation
organism description > animal > mammal > primates > hominids
organism description > animal > mammal > primates > hominids > human
organism description > animal > mammal
organism description > animal > mammal > primates
Communities: CSHL labs > Gingeras lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: October 2023
Date Deposited: 10 Oct 2023 15:05
Last Modified: 08 Jan 2024 16:06
URI: https://repository.cshl.edu/id/eprint/41169
