The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models

Rozowsky, Joel, Gao, Jiahao, Borsari, Beatrice, Yang, Yucheng T, Galeev, Timur, Gürsoy, Gamze, Epstein, Charles B, Xiong, Kun, Xu, Jinrui, Li, Tianxiao, Liu, Jason, Yu, Keyang, Berthel, Ana, Chen, Zhanlin, Navarro, Fabio, Sun, Maxwell S, Wright, James, Chang, Justin, Cameron, Christopher JF, Shoresh, Noam, Gaskell, Elizabeth, Drenkow, Jorg, Adrian, Jessika, Aganezov, Sergey, Aguet, François, Balderrama-Gutierrez, Gabriela, Banskota, Samridhi, Corona, Guillermo Barreto, Chee, Sora, Chhetri, Surya B, Cortez Martins, Gabriel Conte, Danyko, Cassidy, Davis, Carrie A, Farid, Daniel, Farrell, Nina P, Gabdank, Idan, Gofin, Yoel, Gorkin, David U, Gu, Mengting, Hecht, Vivian, Hitz, Benjamin C, Issner, Robbyn, Jiang, Yunzhe, Kirsche, Melanie, Kong, Xiangmeng, Lam, Bonita R, Li, Shantao, Li, Bian, Li, Xiqi, Lin, Khine Zin, Luo, Ruibang, Mackiewicz, Mark, Meng, Ran, Moore, Jill E, Mudge, Jonathan, Nelson, Nicholas, Nusbaum, Chad, Popov, Ioann, Pratt, Henry E, Qiu, Yunjiang, Ramakrishnan, Srividya, Raymond, Joe, Salichos, Leonidas, Scavelli, Alexandra, Schreiber, Jacob M, Sedlazeck, Fritz J, See, Lei Hoon, Sherman, Rachel M, Shi, Xu, Shi, Minyi, Sloan, Cricket Alicia, Strattan, J Seth, Tan, Zhen, Tanaka, Forrest Y, Vlasova, Anna, Wang, Jun, Werner, Jonathan, Williams, Brian, Xu, Min, Yan, Chengfei, Yu, Lu, Zaleski, Christopher, Zhang, Jing, Ardlie, Kristin, Cherry, J Michael, Mendenhall, Eric M, Noble, William S, Weng, Zhiping, Levine, Morgan E, Dobin, Alexander, Wold, Barbara, Mortazavi, Ali, Ren, Bing, Gillis, Jesse, Myers, Richard M, Snyder, Michael P, Choudhary, Jyoti, Milosavljevic, Aleksandar, Schatz, Michael C, Bernstein, Bradley E, Guigó, Roderic, Gingeras, Thomas R, Gerstein, Mark (March 2023) The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell, 186 (7). 1493-1511.e40. ISSN 0092-8674

2023_Rozowsky_The_En-Tex_resource_of_multi-tissue_personal.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
Investigative techniques and equipment
Investigative techniques and equipment > assays
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > epigenetics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > epigenetics
Investigative techniques and equipment > assays > genome wide association studies
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > transcription factor
Communities: CSHL Cancer Center Program
CSHL Cancer Center Program > Gene Regulation and Inheritance Program
CSHL labs > Gillis Lab
CSHL labs > Gingeras lab
CSHL labs > Schatz lab
CSHL labs > Wigler lab
CSHL labs > Dobin Lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 30 March 2023
Date Deposited: 19 Sep 2023 13:54
Last Modified: 30 Apr 2024 19:36
PMCID: PMC10074325
URI: https://repository.cshl.edu/id/eprint/40903
