EvoAug-TF: Extending evolution-inspired data augmentations for genomic deep learning to TensorFlow

Yu, Yiyang; Muthukumar, Shivani; Koo, Peter K (February 2024) EvoAug-TF: Extending evolution-inspired data augmentations for genomic deep learning to TensorFlow. Bioinformatics. ISSN 1367-4811

2024_Yu_EvoAug.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

SUMMARY: Deep neural networks (DNNs) have been widely applied to predict the molecular functions of the non-coding genome. DNNs are data-hungry and thus require many training examples to fit data well. However, functional genomics experiments typically generate limited amounts of data, constrained by the activity levels of the molecular function under study inside the cell. Recently, EvoAug was introduced to train genomic DNNs with evolution-inspired augmentations. EvoAug-trained DNNs have demonstrated improved generalization and interpretability with attribution analysis. However, EvoAug only supports PyTorch-based models, which excludes the broad class of genomic DNNs built in TensorFlow. Here, we extend EvoAug's functionality to TensorFlow in a new package we call EvoAug-TF. Through a systematic benchmark, we find that EvoAug-TF yields performance comparable to the original EvoAug package.

AVAILABILITY: EvoAug-TF is freely available for users and is distributed under an open-source MIT license. Researchers can access the open-source code on GitHub (https://github.com/p-koo/evoaug-tf). The pre-compiled package is provided via PyPI (https://pypi.org/project/evoaug-tf) with in-depth documentation on ReadTheDocs (https://evoaug-tf.readthedocs.io). The scripts for reproducing the results are available at https://github.com/p-koo/evoaug-tf_analysis.
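
The abstract does not spell out the augmentations themselves. As a rough illustration of what an evolution-inspired augmentation can look like, the sketch below applies random point mutations to a one-hot-encoded DNA sequence using NumPy only. The function names (`one_hot`, `random_mutation`), the `mutate_frac` parameter, and the choice of point mutation as the example are assumptions made for illustration. This is not the EvoAug-TF API; the package documentation linked above describes the actual interface.

```python
import numpy as np

ALPHABET = "ACGT"  # column order of the one-hot encoding (an assumption)


def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot array."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    return np.eye(4, dtype=np.float32)[idx]


def random_mutation(x, mutate_frac, rng):
    """Return a copy of x with a fraction of positions resampled.

    Each selected position is overwritten with a base drawn uniformly
    from A/C/G/T, so some selected positions may keep their original base.
    """
    x = x.copy()
    length = x.shape[0]
    n_mut = int(round(mutate_frac * length))
    # Choose distinct positions to perturb.
    positions = rng.choice(length, size=n_mut, replace=False)
    # Draw a replacement base for every chosen position.
    new_bases = rng.integers(0, 4, size=n_mut)
    x[positions] = np.eye(4, dtype=x.dtype)[new_bases]
    return x


rng = np.random.default_rng(0)
x = one_hot("ACGTACGTACGTACGTACGT")
x_aug = random_mutation(x, mutate_frac=0.2, rng=rng)

# Count how many positions actually differ after augmentation.
n_changed = int((x.argmax(axis=1) != x_aug.argmax(axis=1)).sum())
print(f"{n_changed} of {x.shape[0]} positions changed")
```

In a training loop, a transformation of this kind would typically be applied to each mini-batch on the fly while the labels are left unchanged, so the model sees a slightly different version of every sequence in each epoch.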

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics
bioinformatics > computational biology > algorithms
bioinformatics > computational biology
bioinformatics > computational biology > algorithms > machine learning
Communities: CSHL labs > Koo Lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 16 February 2024
Date Deposited: 23 Feb 2024 13:47
Last Modified: 20 Nov 2024 16:25
PMCID: PMC10918628
URI: https://repository.cshl.edu/id/eprint/41440
