A model-based analysis of GC-biased gene conversion in the human and chimpanzee genomes

Capra, J. A., Hubisz, M. J., Kostka, D., Pollard, K. S., Siepel, A. (2013) A model-based analysis of GC-biased gene conversion in the human and chimpanzee genomes. PLoS Genetics, 9 (8). e1003684. ISSN 1553-7390

URL: http://www.ncbi.nlm.nih.gov/pubmed/23966869
DOI: 10.1371/journal.pgen.1003684

Abstract

GC-biased gene conversion (gBGC) is a recombination-associated process that favors the fixation of G/C alleles over A/T alleles. In mammals, gBGC is hypothesized to contribute to variation in GC content, rapidly evolving sequences, and the fixation of deleterious mutations, but its prevalence and general functional consequences remain poorly understood. gBGC is difficult to incorporate into models of molecular evolution and so far has primarily been studied using summary statistics from genomic comparisons. Here, we introduce a new probabilistic model that captures the joint effects of natural selection and gBGC on nucleotide substitution patterns, while allowing for correlations along the genome in these effects. We implemented our model in a computer program, called phastBias, that can accurately detect gBGC tracts about 1 kilobase or longer in simulated sequence alignments. When applied to real primate genome sequences, phastBias predicts gBGC tracts that cover roughly 0.3% of the human and chimpanzee genomes and account for 1.2% of human-chimpanzee nucleotide differences. These tracts fall in clusters, particularly in subtelomeric regions; they are enriched for recombination hotspots and fast-evolving sequences; and they display an ongoing fixation preference for G and C alleles. They are also significantly enriched for disease-associated polymorphisms, suggesting that they contribute to the fixation of deleterious alleles. The gBGC tracts provide a unique window into historical recombination processes along the human and chimpanzee lineages. They supply additional evidence of long-term conservation of megabase-scale recombination rates accompanied by rapid turnover of hotspots. Together, these findings shed new light on the evolutionary, functional, and disease implications of gBGC. The phastBias program and our predicted tracts are freely available.

Item Type: Paper
Uncontrolled Keywords: Animals; Base Sequence; Chromosome Mapping; *Evolution, Molecular; Gene Conversion/*genetics; Genome; Humans; Mammals; Models, Theoretical; Pan troglodytes/*genetics; *Phylogeny; Recombination, Genetic; *Selection, Genetic; Sequence Alignment
Subjects: bioinformatics
bioinformatics > genomics and proteomics > alignment > sequence alignment
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Siepel lab
Depositing User: Matt Covey
Date Deposited: 15 Jan 2015 19:41
Last Modified: 15 Jan 2015 19:41
PMCID: PMC3744432
URI: http://repository.cshl.edu/id/eprint/31050