Deconvolution of Expression for Nascent RNA Sequencing Data (DENR) Highlights Pre-RNA Isoform Diversity in Human Cells

Zhao, Yixin; Dukler, Noah; Barshad, Gilad; Toneyan, Shushan; Danko, Charles; Siepel, Adam (March 2021) Deconvolution of Expression for Nascent RNA Sequencing Data (DENR) Highlights Pre-RNA Isoform Diversity in Human Cells. bioRxiv. (Unpublished)

Available under License Creative Commons Attribution Non-commercial No Derivatives.

DOI: 10.1101/2021.03.16.435537

Abstract

Quantification of mature-RNA isoform abundance from RNA-seq data has been extensively studied, but much less attention has been devoted to quantifying the abundance of distinct precursor RNAs based on nascent RNA sequencing data. Here we address this problem with a new computational method called Deconvolution of Expression for Nascent RNA sequencing data (DENR). DENR models the nascent RNA read counts at each locus as a mixture of user-provided isoforms. The performance of the baseline algorithm is enhanced by the use of machine-learning predictions of transcription start sites (TSSs) and an adjustment for the typical “shape profile” of read counts along a transcription unit. We show using simulated data that DENR clearly outperforms simple read-count-based methods for estimating the abundances of both whole genes and isoforms. By applying DENR to previously published PRO-seq data from K562 and CD4+ T cells, we find that transcription of multiple isoforms per gene is widespread, and the dominant isoform frequently makes use of an internal TSS. We also identify > 200 genes whose dominant isoforms make use of different TSSs in these two cell types. Finally, we apply DENR and StringTie to newly generated PRO-seq and RNA-seq data, respectively, for human CD4+ T cells and CD14+ monocytes, and show that entropy at the pre-RNA level makes a disproportionate contribution to overall isoform diversity, especially across cell types. Altogether, DENR is the first computational tool to enable abundance quantification of pre-RNA isoforms based on nascent RNA sequencing data, and it reveals high levels of pre-RNA isoform diversity in human cells.
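
The mixture idea described in the abstract can be illustrated with a small sketch. This is not DENR's implementation: it assumes a toy locus divided into equal bins, a hypothetical 0/1 design matrix recording which bins each user-provided isoform covers, and plain non-negative least squares solved by multiplicative updates. It omits the TSS predictions and the shape-profile adjustment entirely. The function names `deconvolve` and `isoform_entropy` are made up for this example; the second computes a Shannon entropy over the estimated abundances as one simple way to express isoform diversity.

```python
import math


def deconvolve(counts, design, n_iter=2000):
    """Estimate non-negative isoform abundances x that minimise
    the squared error between design * x and the observed counts.

    counts: observed read counts per bin along the locus
    design: design[i][j] is 1 if isoform j covers bin i, else 0
    (Hypothetical toy model, not the DENR algorithm itself.)
    """
    n_bins, n_iso = len(design), len(design[0])
    # b = A^T y and q = A^T A do not change, so compute them once.
    b = [sum(design[i][j] * counts[i] for i in range(n_bins))
         for j in range(n_iso)]
    q = [[sum(design[i][j] * design[i][k] for i in range(n_bins))
          for k in range(n_iso)] for j in range(n_iso)]
    x = [1.0] * n_iso
    for _ in range(n_iter):
        qx = [sum(q[j][k] * x[k] for k in range(n_iso)) for j in range(n_iso)]
        # Multiplicative update: abundances can shrink or grow but never go negative.
        x = [x[j] * b[j] / qx[j] if qx[j] > 0 else 0.0 for j in range(n_iso)]
    return x


def isoform_entropy(abundances):
    """Shannon entropy (in bits) of the relative isoform abundances."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    fractions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log2(p) for p in fractions)


# Toy locus with six bins: isoform 0 spans all bins, isoform 1 starts
# at an internal TSS and covers only bins 2 to 5.
design = [[1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1]]
counts = [3, 3, 8, 8, 8, 8]

abundances = deconvolve(counts, design)
diversity = isoform_entropy(abundances)
```

With these toy counts the estimates approach 3 for the full-length isoform and 5 for the internal-TSS isoform, so the internal-TSS isoform is the dominant one even though the first two bins show only the longer transcript.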

Item Type: Paper
Subjects: bioinformatics > computational biology
organism description > animal > mammal > primates > hominids > human
Investigative techniques and equipment > assays > RNA-seq
Communities: CSHL labs > Koo Lab
CSHL labs > Siepel lab
School of Biological Sciences > Publications
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 17 March 2021
Date Deposited: 26 May 2022 16:03
Last Modified: 01 Jun 2022 17:29
URI: https://repository.cshl.edu/id/eprint/40637
