Detection of nonneutral substitution rates on mammalian phylogenies

Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R., Siepel, A. (January 2010) Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res, 20 (1). pp. 110-21. ISSN 1088-9051

PDF (Paper)
Siepel Genome Research 2010.pdf - Published Version

Abstract

Methods for detecting nucleotide substitution rates that are faster or slower than expected under neutral drift are widely used to identify candidate functional elements in genomic sequences. However, most existing methods consider either reductions (conservation) or increases (acceleration) in rate but not both, or assume that selection acts uniformly across the branches of a phylogeny. Here we examine the more general problem of detecting departures from the neutral rate of substitution in either direction, possibly in a clade-specific manner. We consider four statistical, phylogenetic tests for addressing this problem: a likelihood ratio test, a score test, a test based on exact distributions of numbers of substitutions, and the genomic evolutionary rate profiling (GERP) test. All four tests have been implemented in a freely available program called phyloP. Based on extensive simulation experiments, these tests are remarkably similar in statistical power. With 36 mammalian species, they all appear to be capable of fairly good sensitivity with low false-positive rates in detecting strong selection at individual nucleotides, moderate selection in 3-bp elements, and weaker or clade-specific selection in longer elements. By applying phyloP to mammalian multiple alignments from the ENCODE project, we shed light on patterns of conservation/acceleration in known and predicted functional elements, approximate fractions of sites subject to constraint, and differences in clade-specific selection in the primate and glires clades. We also describe new "Conservation" tracks in the UCSC Genome Browser that display both phyloP and phastCons scores for genome-wide alignments of 44 vertebrate species.
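As a rough illustration of the first of the four tests named in the abstract, the following minimal sketch shows how a likelihood ratio test against a neutral model could be turned into a signed score. It is not phyloP's implementation. The function names, the chi-square approximation with one degree of freedom, and the sign convention (positive for slower-than-neutral, negative for faster-than-neutral) are all assumptions made for illustration, and the log-likelihoods are taken as given rather than computed from an alignment and a phylogeny.

```python
import math


def lrt_pvalue(loglik_null, loglik_alt):
    """Return (statistic, p-value) for a likelihood ratio test.

    loglik_null: log-likelihood with the rate scale fixed at the neutral value.
    loglik_alt:  log-likelihood with the rate scale estimated freely.
    Assumption: the statistic is compared to a chi-square with 1 df.
    """
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def signed_score(p_value, scale_estimate):
    """Return -log10(p), signed by the direction of the rate change.

    Assumed convention: positive when the estimated scale is below 1
    (slower than neutral), negative when it is above 1 (faster).
    """
    magnitude = -math.log10(max(p_value, 1e-300))  # guard against log10(0)
    return magnitude if scale_estimate < 1.0 else -magnitude


# Hypothetical log-likelihoods for a single alignment column.
stat, p = lrt_pvalue(loglik_null=-105.0, loglik_alt=-100.0)
score = signed_score(p, scale_estimate=0.3)
```

Folding both directions into one signed value is what lets a single score report either conservation or acceleration, which matches the two-sided problem the abstract describes.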

Item Type: Paper
Uncontrolled Keywords: Animals; *Base Sequence; Computer Simulation; Conserved Sequence; *Evolution, Molecular; Humans; Likelihood Functions; Mammals/classification/*genetics; Models, Genetic; Models, Statistical; *Phylogeny; Primates/genetics; *Selection, Genetic; Sequence Alignment; Software; Species Specificity
Subjects: bioinformatics > genomics and proteomics > alignment > sequence alignment
bioinformatics > genomics and proteomics > computers > computer software
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome annotation
CSHL Authors:
Communities: CSHL labs > Siepel lab
Depositing User: Matt Covey
Date: January 2010
Date Deposited: 13 Jan 2015 19:35
Last Modified: 13 Jan 2015 19:35
PMCID: PMC2798823
Related URLs:
URI: https://repository.cshl.edu/id/eprint/31089
