Efficient approximations for learning phylogenetic HMM models from data

Jojic, V., Jojic, N., Meek, C., Geiger, D., Siepel, A., Haussler, D., Heckerman, D. (August 2004) Efficient approximations for learning phylogenetic HMM models from data. Bioinformatics, 20 Sup. i161-8. ISSN 1367-4803

URL: http://www.ncbi.nlm.nih.gov/pubmed/15262795
DOI: 10.1093/bioinformatics/bth917

Abstract

MOTIVATION: We consider models useful for learning an evolutionary or phylogenetic tree from data consisting of DNA sequences corresponding to the leaves of the tree. In particular, we consider a general probabilistic model described in Siepel and Haussler that we call the phylogenetic-HMM model which generalizes the classical probabilistic models of Neyman and Felsenstein. Unfortunately, computing the likelihood of phylogenetic-HMM models is intractable. We consider several approximations for computing the likelihood of such models including an approximation introduced in Siepel and Haussler, loopy belief propagation and several variational methods.

RESULTS: We demonstrate that, unlike the other approximations, variational methods are accurate and are guaranteed to lower bound the likelihood. In addition, we identify a particular variational approximation to be best: one in which the posterior distribution is variationally approximated using the classic Neyman-Felsenstein model. The application of our best approximation to data from the cystic fibrosis transmembrane conductance regulator gene region across nine eutherian mammals reveals a CpG effect.
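The lower-bound property mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm and does not involve a phylogenetic HMM. It only shows the generic variational (Jensen) bound, log p(x) >= sum_h q(h) [log p(h, x) - log q(h)], for a single hidden variable with three states. The joint probabilities and the function names are hypothetical, chosen for illustration.

```python
import math


def log_likelihood(joint):
    """Exact log p(x), obtained by summing the joint p(h, x) over hidden states h."""
    return math.log(sum(joint))


def variational_lower_bound(joint, q):
    """Jensen lower bound on log p(x) for an approximating distribution q(h).

    Computes sum_h q(h) * (log p(h, x) - log q(h)); terms with q(h) == 0
    contribute nothing and are skipped.
    """
    return sum(
        qh * (math.log(ph) - math.log(qh))
        for ph, qh in zip(joint, q)
        if qh > 0
    )


# Hypothetical joint probabilities p(h, x) for one fixed observation x
# and three hidden states h (made-up numbers, not from the paper).
joint = [0.10, 0.05, 0.02]

exact = log_likelihood(joint)

# A crude approximation: uniform q over the hidden states.
uniform_q = [1 / 3, 1 / 3, 1 / 3]
bound_uniform = variational_lower_bound(joint, uniform_q)

# The best possible q is the exact posterior p(h | x), where the bound is tight.
total = sum(joint)
posterior_q = [p / total for p in joint]
bound_posterior = variational_lower_bound(joint, posterior_q)

print(f"exact log-likelihood:      {exact:.4f}")
print(f"bound with uniform q:      {bound_uniform:.4f}")
print(f"bound with posterior as q: {bound_posterior:.4f}")
```

Any valid q gives a value at or below the exact log-likelihood, and the gap closes as q approaches the true posterior. In an intractable model the exact posterior is unavailable, so q is restricted to a tractable family. According to the abstract, the family that worked best was the classic Neyman-Felsenstein model.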

Item Type: Paper
Uncontrolled Keywords: *Algorithms; *Artificial Intelligence; Base Sequence; Computer Simulation; Databases, Genetic; *Evolution, Molecular; Markov Chains; *Models, Genetic; Molecular Sequence Data; Pattern Recognition, Automated/methods; *Phylogeny; Sequence Analysis, DNA/*methods; Sequence Homology, Nucleic Acid
Subjects: bioinformatics > genomics and proteomics > databases
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics > annotation > phylogenetic tree annotation
Communities: CSHL labs > Siepel lab
Depositing User: Matt Covey
Date: 4 August 2004
Date Deposited: 14 Jan 2015 20:24
Last Modified: 14 Jan 2015 20:24
URI: https://repository.cshl.edu/id/eprint/31073
