Liu, Lingjie, Zhao, Yixin, Hassett, Rebecca, Toneyan, Shushan, Koo, Peter K, Siepel, Adam (February 2025) Probabilistic and machine-learning methods for predicting local rates of transcription elongation from nascent RNA sequencing data. Nucleic Acids Research (NAR), 53 (4). ISSN 1362-4962 (Public Dataset)
10.1093.nar.gkaf092.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
Rates of transcription elongation vary within and across eukaryotic gene bodies. Here, we introduce new methods for predicting elongation rates from nascent RNA sequencing data. First, we devise a probabilistic model that predicts nucleotide-specific elongation rates as a generalized linear function of nearby genomic and epigenomic features. We validate this model with simulations and apply it to public PRO-seq (Precision Run-On Sequencing) and epigenomic data for four cell types, finding that reductions in local elongation rate are associated with cytosine nucleotides, DNA methylation, splice sites, RNA stem-loops, CTCF (CCCTC-binding factor) binding sites, and several histone marks, including H3K36me3 and H4K20me1. By contrast, increases in local elongation rate are associated with thymines, A+T-rich and low-complexity sequences, and H3K79me2 marks. We then introduce a convolutional neural network that improves our local rate predictions. Our analysis is the first to permit genome-wide predictions of relative nucleotide-specific elongation rates.
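To make the phrase "generalized linear function of nearby genomic and epigenomic features" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a log link, so that the local rate at nucleotide i is exp(kappa · x_i), and it assumes that expected nascent RNA read density is inversely proportional to the local rate (slower elongation means longer polymerase dwell time). The function names, the three toy features, and the coefficient values are all hypothetical; only the signs follow the associations reported in the abstract.

```python
import math

def elongation_rate(x, kappa):
    """Relative local rate under an assumed log-linear model: exp(kappa . x)."""
    return math.exp(sum(k * v for k, v in zip(kappa, x)))

def expected_read_density(x, kappa, scale=1.0):
    """Assumed inverse relation: slower elongation gives higher read density."""
    return scale / elongation_rate(x, kappa)

# Hypothetical per-nucleotide features: [is_cytosine, is_thymine, is_methylated]
features = [
    [1.0, 0.0, 1.0],  # methylated cytosine
    [0.0, 1.0, 0.0],  # thymine
    [0.0, 0.0, 0.0],  # baseline (no feature present)
]

# Illustrative coefficients, not fitted values: negative slows, positive speeds up
kappa = [-0.2, 0.1, -0.3]

rates = [elongation_rate(x, kappa) for x in features]
densities = [expected_read_density(x, kappa) for x in features]
```

In this toy setting the methylated cytosine has the lowest relative rate and therefore the highest expected read density, while the thymine has the highest rate. Because the rates are only defined up to a scale factor, they are relative rather than absolute, which matches the abstract's wording.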