Symmetry, gauge freedoms, and the interpretability of sequence-function relationships

Posfai, Anna, McCandlish, David M, Kinney, Justin B (May 2024) Symmetry, gauge freedoms, and the interpretability of sequence-function relationships. bioRxiv. (Submitted)

10.1101.2024.05.12.593774.pdf - Submitted Version
Available under License Creative Commons Attribution.

URL: https://www.ncbi.nlm.nih.gov/pubmed/38798625
DOI: 10.1101/2024.05.12.593774

Abstract

Quantitative models of sequence-function relationships, which describe how biological sequences encode functional activities, are ubiquitous in modern biology. One important aspect of these models is that they commonly exhibit gauge freedoms, i.e., directions in parameter space that do not affect model predictions. In physics, gauge freedoms arise when physical theories are formulated in ways that respect fundamental symmetries. However, the connections that gauge freedoms in models of sequence-function relationships have to the symmetries of sequence space have yet to be systematically studied. Here we study the gauge freedoms of models that respect a specific symmetry of sequence space: the group of position-specific character permutations. We find that gauge freedoms arise when the transformations of model parameters that compensate for these symmetry transformations are described by redundant irreducible matrix representations. Based on this finding, we describe an "embedding distillation" procedure that enables analytic calculation of the dimension of the space of gauge freedoms, as well as efficient computation of a sparse basis for this space. Finally, we show that the ability to interpret model parameters as quantifying allelic effects places strong constraints on the form that models can take, and in particular show that all nontrivial equivariant models of allelic effects must exhibit gauge freedoms. Our work thus advances the understanding of the relationship between symmetries and gauge freedoms in quantitative models of sequence-function relationships.
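
As a small illustration of what a gauge freedom is (this is not the paper's "embedding distillation" procedure), the sketch below assumes a simple additive model with a constant term and one-hot parameters at each position. It counts gauge freedoms as the number of parameters minus the rank of the design matrix built over all sequences. The function names, the alphabet, and the sequence length are hypothetical choices made only for this example.

```python
import itertools
import numpy as np

def additive_design_matrix(L, alphabet):
    """Design matrix of an additive model (illustrative assumption).

    Rows are all sequences of length L; columns are a constant feature
    followed by one one-hot feature per (position, character) pair.
    """
    seqs = list(itertools.product(alphabet, repeat=L))
    n_params = 1 + L * len(alphabet)
    X = np.zeros((len(seqs), n_params))
    for row, seq in enumerate(seqs):
        X[row, 0] = 1.0  # constant term
        for pos, char in enumerate(seq):
            X[row, 1 + pos * len(alphabet) + alphabet.index(char)] = 1.0
    return X

def gauge_dimension(X):
    """Number of parameter directions that leave all predictions unchanged."""
    return X.shape[1] - np.linalg.matrix_rank(X)

L, alphabet = 3, "ACGT"
X = additive_design_matrix(L, alphabet)
n_gauge = gauge_dimension(X)

# One explicit gauge direction: add 1 to every parameter at position 0
# and subtract 1 from the constant term. Predictions X @ theta are unchanged.
shift = np.zeros(X.shape[1])
shift[0] = -1.0
shift[1:1 + len(alphabet)] = 1.0
predictions_unchanged = np.allclose(X @ shift, 0.0)
```

Under these assumptions the model has one gauge freedom per position, because a constant can be moved between each position's parameters and the constant term without changing any prediction.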

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics
bioinformatics > quantitative biology
Communities: CSHL labs > Kinney lab
CSHL labs > McCandlish lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 13 May 2024
Date Deposited: 04 Jun 2024 15:06
Last Modified: 04 Jun 2024 15:06
PMCID: PMC11118426
URI: https://repository.cshl.edu/id/eprint/41578
