Statistical correction of input gradients for black box models trained with categorical input features

Majdandzic, Antonio; Koo, Peter K. (May 2022) Statistical correction of input gradients for black box models trained with categorical input features. bioRxiv. (Unpublished)

PDF: 2022.Majdandzic.black_box_models.pdf
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Gradients of a deep neural network’s predictions with respect to the inputs are used in a variety of downstream analyses, notably in post hoc explanations with feature attribution methods. For data with input features that live on a lower-dimensional manifold, we observe that the learned function can exhibit arbitrary behaviors off the manifold, where no data exists to anchor the function during training. This leads to a random component in the gradients which manifests as noise. We introduce a simple correction for this off-manifold gradient noise for the case of categorical input features, where input values are subject to a probabilistic simplex constraint, and demonstrate its effectiveness on regulatory genomics data. We find that our correction consistently leads to a significant improvement in gradient-based attribution scores.
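The abstract does not spell out the correction itself, so the following is only a minimal sketch under a stated assumption: that the off-manifold component of the gradient is the part orthogonal to the probability simplex at each position, and that it can be removed by subtracting the per-position mean of the gradient across categories. The function name `correct_gradients`, the array layout (positions by categories), and the random toy gradient are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import numpy as np


def correct_gradients(grad):
    """Sketch of a simplex-based gradient correction (assumed form).

    grad: array of shape (L, A) holding the gradient of a model output
          with respect to a one-hot input with L positions and A
          categories (e.g. A = 4 for nucleotides).

    Returns the gradient with its per-position mean across categories
    removed, so that each position's gradient sums to zero and lies in
    the tangent space of the probability simplex.
    """
    grad = np.asarray(grad, dtype=float)
    # Subtract the mean over the category axis at every position.
    return grad - grad.mean(axis=-1, keepdims=True)


# Toy stand-in for an input gradient: 5 positions, 4 categories.
rng = np.random.default_rng(0)
grad = rng.normal(size=(5, 4))
corrected = correct_gradients(grad)
```

Under this assumption, the corrected gradient at each position sums to zero across categories, and applying the function a second time leaves the result unchanged, since the operation is a projection.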

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > computational biology > algorithms
bioinformatics > computational biology
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
bioinformatics > computational biology > algorithms > machine learning
Communities: CSHL labs > Koo Lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 1 May 2022
Date Deposited: 25 May 2022 15:44
Last Modified: 16 Jan 2024 18:56
URI: https://repository.cshl.edu/id/eprint/40623
