Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data

Huang, Y. F., Gulko, B., Siepel, A. (2017) Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. Nat Genet, 49 (4). pp. 618-624. ISSN 1061-4036

URL: https://www.ncbi.nlm.nih.gov/pubmed/28288115
DOI: 10.1038/ng.3810

Abstract

Many genetic variants that influence phenotypes of interest are located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which, therefore, are likely to be phenotypically important. LINSIGHT combines a generalized linear model for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the 'big data' available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell type, tissue specificity, and constraints at associated promoters.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > genetics & nucleic acid processing > population genetics
Communities: CSHL labs > Siepel lab
Depositing User: Matt Covey
Date Deposited: 17 Mar 2017 20:40
Last Modified: 08 Sep 2017 19:39
PMCID: PMC5395419
URI: http://repository.cshl.edu/id/eprint/34276
