Sequence features in regions of weak and strong linkage disequilibrium

Smith, A. V., Thomas, D. J., Munro, H. M., Abecasis, G. R. (November 2005) Sequence features in regions of weak and strong linkage disequilibrium. Genome Res, 15 (11). pp. 1519-34. ISSN 1088-9051 (Print)

URL: http://www.genome.org/cgi/content/full/15/11/1519
DOI: 10.1101/gr.4421405

Abstract

We use genotype data generated by the International HapMap Project to dissect the relationship between sequence features and the degree of linkage disequilibrium in the genome. We show that variation in linkage disequilibrium is broadly similar across populations and examine sequence landscape in regions of strong and weak disequilibrium. Linkage disequilibrium is generally low within approximately 15 Mb of the telomeres of each chromosome and noticeably elevated in large, duplicated regions of the genome as well as within approximately 5 Mb of centromeres and other heterochromatic regions. At a broad scale (100-1000 kb resolution), our results show that regions of strong linkage disequilibrium are typically GC poor and have reduced polymorphism. In addition, these regions are enriched for LINE repeats, but have fewer SINE, DNA, and simple repeats than the rest of the genome. At a fine scale, we examine the sequence composition of "hotspots" for the rapid breakdown of linkage disequilibrium and show that they are enriched in SINEs, in simple repeats, and in sequences that are conserved between species. Regions of high and low linkage disequilibrium (the top and bottom quartiles of the genome) have a higher density of genes and coding bases than the rest of the genome. Closer examination of the data shows that whereas some types of genes (including genes involved in immune response and sensory perception) are typically located in regions of low linkage disequilibrium, other genes (including those involved in DNA and RNA metabolism, response to DNA damage, and the cell cycle) are preferentially located in regions of strong linkage disequilibrium. Our results provide a detailed analysis of the relationship between sequence features and linkage disequilibrium and suggest an evolutionary justification for the heterogeneity in linkage disequilibrium in the genome.

Item Type: Paper
Uncontrolled Keywords: Base Composition; Chromosomes, Human/genetics; Computational Biology/methods; Gene Frequency; Genome, Human/genetics; Genomics/methods; Haplotypes/genetics; Humans; Linkage Disequilibrium/genetics; Models, Genetic; Multivariate Analysis; Short Interspersed Nucleotide Elements/genetics; Statistics, Nonparametric; Variation (Genetics)
Subjects: bioinformatics > genomics and proteomics > databases > database search and retrieval
bioinformatics > genomics and proteomics > annotation > sequence annotation
Depositing User: CSHL Librarian
Date: November 2005
Date Deposited: 06 Jan 2012 14:19
Last Modified: 06 Jan 2012 14:19
URI: https://repository.cshl.edu/id/eprint/22709
