Koo, Peter K., Anand, Praveen, Paul, Steffan B., Eddy, Sean R. (2018) Inferring Sequence-Structure Preferences of RNA-Binding Proteins with Convolutional Residual Networks. bioRxiv. p. 418459. (Unpublished)
Abstract
To infer the sequence and RNA structure specificities of RNA-binding proteins (RBPs) from experiments that enrich for bound sequences, we introduce a convolutional residual network which we call ResidualBind. ResidualBind significantly outperforms previous methods on experimental data from many RBP families. We interrogate ResidualBind to identify what features it has learned from high-affinity sequences with saliency analysis along with 1st-order and 2nd-order in silico mutagenesis. We show that in addition to sequence motifs, ResidualBind learns a model that includes the number of motifs, their spacing, and both positive and negative effects of RNA structure context. Strikingly, ResidualBind learns RNA structure context, including detailed base-pairing relationships, directly from sequence data, which we confirm on synthetic data. ResidualBind is a powerful, flexible, and interpretable model that can uncover cis-recognition preferences across a broad spectrum of RBPs.
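As an illustration of the 1st-order in silico mutagenesis mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the general idea is to substitute each nucleotide in turn and record how a model's predicted score changes. The names `toy_score` and `first_order_mutagenesis`, the example sequence, and the placeholder motif are all hypothetical; `toy_score` is a stand-in for a trained network such as ResidualBind and simply counts motif occurrences.

```python
ALPHABET = "ACGU"  # RNA nucleotides


def toy_score(seq, motif="UGCAUG"):
    """Hypothetical stand-in for a trained model: counts motif occurrences."""
    width = len(motif)
    return float(sum(seq[i:i + width] == motif
                     for i in range(len(seq) - width + 1)))


def first_order_mutagenesis(seq, score_fn):
    """Score change for every single-nucleotide substitution.

    Returns one dict per position, mapping each nucleotide to
    score(mutant) - score(wild type).
    """
    wild_type = score_fn(seq)
    effects = []
    for pos in range(len(seq)):
        row = {}
        for alt in ALPHABET:
            # Substitute a single position and rescore the sequence.
            mutant = seq[:pos] + alt + seq[pos + 1:]
            row[alt] = score_fn(mutant) - wild_type
        effects.append(row)
    return effects


# Assumed example sequence: the placeholder motif occupies positions 2-7.
seq = "AAUGCAUGAA"
effects = first_order_mutagenesis(seq, toy_score)
```

In this toy setting, substitutions inside the motif lower the score while substitutions in the flanks leave it unchanged. A 2nd-order analysis would extend the same loop to pairs of positions, which is one way interactions such as base pairing could be probed.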
Item Type: | Paper |
---|---|
Subjects: | bioinformatics > computational biology > algorithms > machine learning; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > RNA binding protein |
CSHL Authors: | |
Communities: | CSHL labs > Koo Lab |
Depositing User: | Matthew Dunn |
Date: | 2018 |
Date Deposited: | 16 Sep 2019 18:33 |
Last Modified: | 16 Sep 2019 18:58 |
URI: | https://repository.cshl.edu/id/eprint/38382 |