Identifying Interpretable Visual Features in Artificial and Biological Neural Systems

Klindt, David, Sanborn, Sophia, Acosta, Francisco, Poitevin, Frédéric, Miolane, Nina (October 2023) Identifying Interpretable Visual Features in Artificial and Biological Neural Systems. arXiv. (Submitted)

PDF: 10.48550.arXiv.2310.11431.pdf (5MB)
Available under License Creative Commons Attribution.

DOI: 10.48550/arXiv.2310.11431

Abstract

Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features. However, many neurons exhibit mixed selectivity, i.e., they represent multiple unrelated features. A recent hypothesis proposes that features in deep networks may be represented in superposition, i.e., on non-orthogonal axes by multiple neurons, since the number of possible interpretable features in natural data is generally larger than the number of neurons in a given network. Accordingly, we should be able to find meaningful directions in activation space that are not aligned with individual neurons. Here, we propose (1) an automated method for quantifying visual interpretability that is validated against a large database of human psychophysics judgments of neuron interpretability, and (2) an approach for finding meaningful directions in network activation space. We leverage these methods to discover directions in convolutional neural networks that are more intuitively meaningful than individual neurons, as we confirm and investigate in a series of analyses. Moreover, we apply the same method to three recent datasets of visual neural responses in the brain and find that our conclusions largely transfer to real neural data, suggesting that superposition might be deployed by the brain. This also provides a link with disentanglement and raises fundamental questions about robust, efficient and factorized representations in both artificial and biological neural systems.
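
The superposition idea in the abstract can be illustrated with a toy sketch. This is not the authors' method or data: it assumes a hypothetical activation space of 2 neurons holding 3 features, with hand-picked unit directions spaced 120 degrees apart (`directions`, `activations`, and `readout` are illustrative names). It shows that the feature axes cannot all be orthogonal, that single neurons respond to several features (mixed selectivity), and that projecting onto a feature's direction still singles out the active feature.

```python
import numpy as np

# Toy assumption: 3 features share a 2-neuron activation space.
N_FEATURES, N_NEURONS = 3, 2

# Unit-length feature directions spaced 120 degrees apart (hypothetical choice).
angles = 2 * np.pi * np.arange(N_FEATURES) / N_FEATURES
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)

# Pairwise cosine similarity: off-diagonal entries are nonzero,
# so the feature axes cannot all be orthogonal.
gram = directions @ directions.T


def activations(feature_strengths):
    """Neuron activations produced by a vector of feature strengths."""
    return np.asarray(feature_strengths) @ directions


def readout(neuron_activations):
    """Project neuron activations onto every feature direction."""
    return directions @ np.asarray(neuron_activations)


# Activate one feature at a time and compare the single-neuron view
# with the view along the feature directions.
for k in range(N_FEATURES):
    a = activations(np.eye(N_FEATURES)[k])
    print(f"feature {k}: neurons={np.round(a, 2)}, readout={np.round(readout(a), 2)}")
```

In this sketch the first neuron responds to all three features, while the readout along each direction is largest for the feature that was actually active. Real networks and neural recordings are far higher-dimensional, and the directions there have to be discovered rather than assumed.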

Item Type:	Paper
Subjects:	bioinformatics; bioinformatics > computational biology > algorithms; bioinformatics > computational biology; organs, tissues, organelles, cell types and functions > tissues types and functions > neural networks; organs, tissues, organelles, cell types and functions; organs, tissues, organelles, cell types and functions > tissues types and functions
CSHL Authors:	Klindt, David
Communities:	CSHL labs > Klindt lab
SWORD Depositor:	CSHL Elements
Depositing User:	CSHL Elements
Date:	18 October 2023
Date Deposited:	26 Mar 2024 19:35
Last Modified:	17 Jul 2024 14:05
Related URLs:	Publisher
URI:	https://repository.cshl.edu/id/eprint/41479
