How Much Information is Provided by Human Epigenomic Data? An Evolutionary View

Gulko, Brad, Siepel, Adam (May 2018) How Much Information is Provided by Human Epigenomic Data? An Evolutionary View. BioRxiv. (Unpublished)

2018.Gulko.human_epigenomic_data.pdf
Available under License Creative Commons Attribution.

DOI: 10.1101/317719

Abstract

Here, we ask the question, “How much information do available epigenomic data sets provide about human genomic function, individually or in combination?” We consider nine epigenomic and annotation features across 115 cell types and measure genomic function by using signatures of natural selection as a proxy. We measure information as the reduction in entropy under a probabilistic evolutionary model that describes genetic variation across ∼50 diverse humans and several nonhuman primates. We find that several genomic features yield more information in combination than they do individually, with DNase-seq displaying particularly strong synergy. Most of the entropy in human genetic variation, by far, reflects mutation and neutral drift; the genome-wide reduction in entropy due to selection is equivalent to only a small fraction of the storage requirements of a single human genome. Based on this framework, we produce cell-type-specific maps of the probability that a mutation at each nucleotide will have fitness consequences (FitCons scores). These scores are predictive of known functional elements and disease-associated variants, they reveal relationships among cell types, and they suggest that ∼8% of nucleotide sites are constrained by natural selection.
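
The notion of "information as a reduction in entropy" can be illustrated with a toy calculation. The sketch below is not the authors' evolutionary model: it assumes, purely for illustration, that each site carries a binary outcome (for example, polymorphic or not) modeled as a Bernoulli variable, and it measures how many fewer bits are needed to encode the outcomes once sites are partitioned by a hypothetical feature label. The function names, the labels "open" and "closed", and the data are all invented for this example, and the calculation ignores the cost of the extra parameters.

```python
import math
from collections import defaultdict


def bernoulli_bits(k, n):
    """Bits to encode n binary outcomes containing k ones,
    using the maximum-likelihood Bernoulli probability p = k / n."""
    if n == 0 or k == 0 or k == n:
        return 0.0  # a degenerate (or empty) class costs no bits
    p = k / n
    return -(k * math.log2(p) + (n - k) * math.log2(1 - p))


def information_bits(outcomes, labels):
    """Entropy reduction (in bits) from fitting one Bernoulli model per
    feature label instead of a single pooled model for all sites."""
    pooled = bernoulli_bits(sum(outcomes), len(outcomes))
    counts = defaultdict(lambda: [0, 0])  # label -> [ones, sites]
    for x, lab in zip(outcomes, labels):
        counts[lab][0] += x
        counts[lab][1] += 1
    partitioned = sum(bernoulli_bits(k, n) for k, n in counts.values())
    return pooled - partitioned


# Hypothetical toy data: 8 sites, a binary outcome and one feature label each.
outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
labels = ["open"] * 4 + ["closed"] * 4

info = information_bits(outcomes, labels)
print(f"information from the feature: {info:.2f} bits")
```

A feature that does not split the sites at all yields zero bits under this measure, while a feature that separates sites with different outcome rates yields a positive value. The same subtraction applied to combinations of features is one way to think about the synergy described above.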

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
organism description > animal
bioinformatics > computational biology
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > epigenetics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > epigenetics
organism description > animal > mammal > primates > hominids
organism description > animal > mammal > primates > hominids > human
organism description > animal > mammal
organism description > animal > mammal > primates
CSHL Authors:
Communities: CSHL labs > Siepel lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 9 May 2018
Date Deposited: 24 May 2021 18:05
Last Modified: 20 Feb 2024 20:12
URI: https://repository.cshl.edu/id/eprint/40154
