Healy, J., Thomas, E. E., Schwartz, J. T., Wigler, M. H. (October 2003) Annotating large genomes with exact word matches. Genome Research, 13 (10). pp. 2306-2315. ISSN 1088-9051
Abstract
We have developed a tool for rapidly determining the number of exact matches of any word within large, internally repetitive genomes or sets of genomes. Thus we can readily annotate any sequence, including the entire human genome, with the counts of its constituent words. We create a Burrows-Wheeler transform of the genome, which, together with auxiliary data structures facilitating counting, can reside in about one gigabyte of RAM. Our original interest was motivated by oligonucleotide probe design, and we describe a general protocol for defining unique hybridization probes. But our method also has applications for the analysis of genome structure and assembly. We demonstrate the identification of chromosome-specific repeats, and outline a general procedure for finding undiscovered repeats. We also illustrate the changing contents of the human genome assemblies by comparing the annotations built from different genome freezes.
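To illustrate the idea in the abstract, the following is a minimal sketch of counting exact word matches with a Burrows-Wheeler transform and auxiliary count tables. It uses the textbook backward-search procedure and is not the authors' implementation. The function names (`bwt`, `build_index`, `count_matches`, `annotate`), the `$` sentinel, and the toy genome are all assumptions made for this example. The rotation-sorting construction and the full occurrence table are only practical for very small inputs; a genome-scale tool would need far more compact structures.

```python
from collections import Counter

def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    s = text + "$"  # assumed sentinel; sorts before A, C, G, T in ASCII
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def build_index(last):
    """Build the auxiliary counting tables for a BWT string."""
    counts = Counter(last)
    # first_row[c]: number of characters in the text strictly smaller than c
    first_row, total = {}, 0
    for c in sorted(counts):
        first_row[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in last[:i]
    occ = {c: [0] * (len(last) + 1) for c in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first_row, occ

def count_matches(word, last, first_row, occ):
    """Count exact occurrences of word by backward search over the BWT."""
    lo, hi = 0, len(last)  # half-open range of sorted rotations
    for c in reversed(word):
        if c not in first_row:
            return 0
        lo = first_row[c] + occ[c][lo]
        hi = first_row[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo

def annotate(sequence, k, last, first_row, occ):
    """Annotate a sequence with the genome-wide count of each of its k-mers."""
    return [count_matches(sequence[i:i + k], last, first_row, occ)
            for i in range(len(sequence) - k + 1)]

genome = "ACGTACGTTACG"  # hypothetical toy genome
last = bwt(genome)
first_row, occ = build_index(last)
print(count_matches("ACG", last, first_row, occ))
print(annotate("ACGTT", 3, last, first_row, occ))
```

Each query takes time proportional to the word length and does not depend on the genome length, which is what makes annotating every word of a long sequence feasible. A word whose count is 1 is unique in the indexed genome, which is the property a unique hybridization probe would need.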
Item Type: | Paper |
---|---|
Uncontrolled Keywords: | DATABASE database genome sequence REPUTER reputer tool TOOL |
Subjects: | bioinformatics > genomics and proteomics > databases > database construction; bioinformatics > genomics and proteomics > annotation > sequence annotation; bioinformatics > genomics and proteomics > databases > databases; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome rendering; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome annotation |
Communities: | CSHL labs > Wigler lab; School of Biological Sciences > Publications |
Depositing User: | CSHL Librarian |
Date: | October 2003 |
Date Deposited: | 12 Apr 2012 19:17 |
Last Modified: | 19 Sep 2014 14:40 |
PMCID: | PMC403711 |
URI: | https://repository.cshl.edu/id/eprint/26195 |