Tareen, Ammar, Posfai, Anna, Ireland, William, McCandlish, David, Kinney, Justin (July 2020) MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. BioRxiv. (Unpublished)
Abstract
Multiplex assays of variant effect (MAVEs), which include massively parallel reporter assays (MPRAs) and deep mutational scanning (DMS) experiments, are being rapidly adopted in many areas of biology. However, inferring quantitative models of genotype-phenotype (G-P) maps from MAVE data remains challenging, and different inference approaches have been advocated in different MAVE contexts. Here we introduce a conceptually unified approach to the problem of learning G-P maps from MAVE data. Our strategy is grounded in concepts from information theory, and is based on the view of G-P maps as a form of information compression. We also introduce MAVE-NN, a Python package that implements this approach using a neural network backend. The capabilities and advantages of MAVE-NN are then demonstrated on three diverse DMS and MPRA datasets. MAVE-NN thus fills a major need in the computational analysis of MAVE data. Installation instructions, tutorials, and documentation are provided at https://mavenn.readthedocs.io.
Item Type: | Paper |
---|---|
Subjects: | bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function > gene expression; bioinformatics > computational biology > algorithms > machine learning; organs, tissues, organelles, cell types and functions > tissues types and functions > neural networks |
CSHL Authors: | |
Communities: | CSHL labs > Kinney lab; CSHL labs > McCandlish lab |
SWORD Depositor: | CSHL Elements |
Depositing User: | CSHL Elements |
Date: | 14 July 2020 |
Date Deposited: | 07 May 2021 14:20 |
Last Modified: | 29 Apr 2024 15:26 |
URI: | https://repository.cshl.edu/id/eprint/40046 |