Finishing the euchromatic sequence of the human genome

International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature, 431 (7011). pp. 931-945. ISSN 0028-0836

URL: http://www.ncbi.nlm.nih.gov/pubmed/15496913
DOI: 10.1038/nature03001

Abstract

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

Item Type: Paper
Uncontrolled Keywords: Amino Acid Sequence; Base Sequence; Centromere/genetics; Chromosomes, Artificial, Bacterial; Chromosomes, Human/genetics; DNA, Complementary/genetics; Euchromatin/*genetics; Gene Duplication; Genes/genetics; *Genome, Human; Heterochromatin/genetics; Human Genome Project; Humans; Molecular Sequence Data; Multigene Family/genetics; *Physical Chromosome Mapping; Plasmids; Pseudogenes/genetics; Research Design; Sensitivity and Specificity; *Sequence Analysis, DNA; Telomere/genetics
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > chromosome
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > chromosomes, structure and function > chromosome
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Siepel lab
Depositing User: Matt Covey
Date Deposited: 15 Jan 2015 20:17
Last Modified: 15 Jan 2015 20:17
URI: http://repository.cshl.edu/id/eprint/31053