Inferring Sequence-Structure Preferences of RNA-Binding Proteins with Convolutional Residual Networks

Koo, Peter K., Anand, Praveen, Paul, Steffan B., Eddy, Sean R. (2018) Inferring Sequence-Structure Preferences of RNA-Binding Proteins with Convolutional Residual Networks. bioRxiv. p. 418459. (Unpublished)

URL: http://biorxiv.org/content/early/2018/09/15/418459...
DOI: 10.1101/418459

Abstract

To infer the sequence and RNA structure specificities of RNA-binding proteins (RBPs) from experiments that enrich for bound sequences, we introduce a convolutional residual network which we call ResidualBind. ResidualBind significantly outperforms previous methods on experimental data from many RBP families. We interrogate ResidualBind to identify what features it has learned from high-affinity sequences with saliency analysis along with 1st-order and 2nd-order in silico mutagenesis. We show that in addition to sequence motifs, ResidualBind learns a model that includes the number of motifs, their spacing, and both positive and negative effects of RNA structure context. Strikingly, ResidualBind learns RNA structure context, including detailed base-pairing relationships, directly from sequence data, which we confirm on synthetic data. ResidualBind is a powerful, flexible, and interpretable model that can uncover cis-recognition preferences across a broad spectrum of RBPs.
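As an illustration of the first-order in silico mutagenesis mentioned in the abstract, the following minimal sketch scores every single-nucleotide substitution of an RNA sequence and records the change relative to the unmutated sequence. It is not the authors' implementation: `toy_score`, the placeholder motif, and the example sequence are hypothetical stand-ins for a trained network's prediction and real data.

```python
ALPHABET = "ACGU"


def toy_score(seq, motif="UGCAUG"):
    """Hypothetical stand-in for a trained model's binding score.

    It simply counts occurrences of a placeholder motif; a real analysis
    would call the trained network here instead.
    """
    width = len(motif)
    return float(sum(seq[i:i + width] == motif
                     for i in range(len(seq) - width + 1)))


def first_order_mutagenesis(seq, score_fn=toy_score):
    """Map (position, nucleotide) to the score change of that substitution."""
    wild_type = score_fn(seq)
    effects = {}
    for pos, ref in enumerate(seq):
        for nt in ALPHABET:
            if nt == ref:
                continue  # skip the unmutated nucleotide
            mutant = seq[:pos] + nt + seq[pos + 1:]
            effects[(pos, nt)] = score_fn(mutant) - wild_type
    return effects


seq = "AAUGCAUGAA"  # made-up sequence containing the placeholder motif once
effects = first_order_mutagenesis(seq)
```

Substitutions inside the motif lower the toy score, while substitutions in the flanks leave it unchanged. Second-order mutagenesis would extend the same loop to pairs of positions, at quadratic cost in sequence length.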

Item Type: Paper
Subjects: bioinformatics > computational biology > algorithms > machine learning
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > RNA binding protein
CSHL Authors:
Communities: CSHL labs > Koo Lab
Depositing User: Matthew Dunn
Date: 2018
Date Deposited: 16 Sep 2019 18:33
Last Modified: 16 Sep 2019 18:58
URI: https://repository.cshl.edu/id/eprint/38382
