Selecting deep neural networks that yield consistent attribution-based interpretations for genomics

Majdandzic, Antonio, Rajesh, Chandana, Tang, Amber, Toneyan, Shushan, Labelson, Ethan, Tripathy, Rohit, Koo, Peter K (November 2022) Selecting deep neural networks that yield consistent attribution-based interpretations for genomics. Proc Mach Learn Res, 200. pp. 131-149. ISSN 2640-3498

PDF
Selecting deep neural networks that yield consistent attribution-based interpretations for genomics.pdf - Published Version
Available under License Creative Commons Attribution.

URL: https://www.ncbi.nlm.nih.gov/pubmed/37205975

Abstract

Deep neural networks (DNNs) have advanced our ability to take DNA primary sequence as input and predict a myriad of molecular activities measured via high-throughput functional genomic assays. Post hoc attribution analysis has been employed to provide insights into the importance of features learned by DNNs, often revealing patterns such as sequence motifs. However, attribution maps typically harbor spurious importance scores to an extent that varies from model to model, even for DNNs whose predictions generalize well. Thus, the standard approach for model selection, which relies on performance of a held-out validation set, does not guarantee that a high-performing DNN will provide reliable explanations. Here we introduce two approaches that quantify the consistency of important features across a population of attribution maps; consistency reflects a qualitative property of human interpretable attribution maps. We employ the consistency metrics as part of a multivariate model selection framework to identify models that yield high generalization performance and interpretable attribution analysis. We demonstrate the efficacy of this approach across various DNNs quantitatively with synthetic data and qualitatively with chromatin accessibility data.

Item Type:	Paper
Subjects:	bioinformatics; bioinformatics > genomics and proteomics > genetics & nucleic acid processing; bioinformatics > genomics and proteomics; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > comparative genomics; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes; organs, tissues, organelles, cell types and functions > tissues types and functions > neural networks
CSHL Authors:	Majdandzic, Antonio; Toneyan, Shushan; Tripathy, Rohit; Koo, Peter K; Rajesh, Chandana; Labelson, Ethan
Communities:	CSHL Cancer Center Program; CSHL Cancer Center Program > Gene Regulation and Inheritance Program; CSHL Cancer Center Shared Resources; CSHL labs > Koo Lab; School of Biological Sciences > Publications
SWORD Depositor:	CSHL Elements
Depositing User:	CSHL Elements
Date:	November 2022
Date Deposited:	21 Sep 2023 19:23
Last Modified:	29 Feb 2024 18:17
PMCID:	PMC10194041
URI:	https://repository.cshl.edu/id/eprint/40956
