Xuan, Z., Wang, J., Zhang, M. Q. (2003) Computational comparison of two mouse draft genomes and the human golden path. Genome Biology, 4 (1). R1. ISSN 1474-7596
Abstract
BACKGROUND: The availability of both mouse and human draft genomes has marked the beginning of a new era of comparative mammalian genomics. The two available mouse genome assemblies, from the public mouse genome sequencing consortium and Celera Genomics, were obtained using different clone libraries and different assembly methods.

RESULTS: We present here a critical comparison of the two latest mouse genome assemblies. The utility of the combined genomes is further demonstrated by comparing them with the human 'golden path' and through a subsequent analysis of the resulting conserved sequence element (CSE) database, which allows us to identify over 6,000 potential novel genes and to derive independent estimates of the number of human protein-coding genes.

CONCLUSION: The Celera and public mouse assemblies differ in about 10% of the mouse genome. Each assembly has advantages over the other: Celera has higher base-pair accuracy and higher overall coverage of the genome; the public assembly, however, has higher sequence quality in some newly finished bacterial artificial chromosome (BAC) clone regions, and its data are freely accessible. Perhaps most important, by combining both assemblies we can get a better annotation of the human genome; in particular, we can obtain the most complete set of CSEs, one third of which are related to known genes, while some others are related to other functional genomic regions. More than half the CSEs are of unknown function. From the CSEs, we estimate the total number of human protein-coding genes to be about 40,000. This searchable, publicly available online database (CSEdb) will expedite new discoveries through comparative genomics.
Item Type: | Paper |
---|---|
Uncontrolled Keywords: | Animals; Chromosome Mapping; Computational Biology/methods; Conserved Sequence/genetics; Databases, Nucleic Acid; Gene Duplication; Genes/genetics; Genome; Genome, Human; Humans; Mice/genetics; Promoter Regions (Genetics)/genetics; Synteny |
Subjects: | bioinformatics > genomics and proteomics > genetics & nucleic acid processing; bioinformatics > genomics and proteomics; bioinformatics > genomics and proteomics > Mapping and Rendering; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes; organism description > animal > mammal > rodent > mouse; organism description > animal > mammal > rodent |
Communities: | CSHL labs > Zhang lab |
Depositing User: | Matt Covey |
Date: | 2003 |
Date Deposited: | 01 Jul 2013 19:44 |
Last Modified: | 01 Jul 2013 19:44 |
PMCID: | PMC151282 |
URI: | https://repository.cshl.edu/id/eprint/27875 |