Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences

Thompson, Mike, Martin, Mariano, Olmo, Trinidad Sanmartin, Rajesh, Chandana, Koo, Peter, Bolognesi, Benedetta, Lehner, Ben (July 2024) Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences. bioRxiv. (Submitted)

10.1101.2024.07.13.603366.pdf - Submitted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

DOI: 10.1101/2024.07.13.603366

Abstract

Insoluble amyloid aggregates are the hallmarks of more than fifty human diseases, including the most common neurodegenerative disorders. The process by which soluble proteins nucleate to form amyloid fibrils is, however, quite poorly characterized. Relatively few sequences are known that form amyloids with high propensity and this data shortage likely limits our capacity to understand, predict, engineer, and prevent the formation of amyloid fibrils. Here we quantify the nucleation of amyloids at an unprecedented scale and use the data to train a deep learning model of amyloid nucleation. In total, we quantify the nucleation rates of >100,000 20-amino-acid-long peptides. This large and diverse dataset allows us to train CANYA, a convolution-attention hybrid neural network. CANYA is fast and outperforms existing methods with stable performance across diverse prediction tasks. Interpretability analyses reveal CANYA’s decision-making process and learned grammar, providing mechanistic insights into amyloid nucleation. Our results illustrate the power of massive experimental analysis of random sequence-spaces and provide an interpretable and robust neural network model to predict amyloid nucleation.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > quantitative biology
Communities: CSHL labs > Koo Lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 17 July 2024
Date Deposited: 23 Jul 2024 14:55
Last Modified: 23 Jul 2024 14:55
URI: https://repository.cshl.edu/id/eprint/41623
