Fischer, Stephan, Gillis, Jesse (September 2021) Defining the extent of gene function using ROC curvature. BioRxiv. (Unpublished)
PDF: 2021.Fischer.gene_function.pdf (3MB). Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Machine learning in genomics plays a key role in leveraging high-throughput data, but assessing the generalizability of performance has been a persistent challenge. Here, we propose to evaluate the generalizability of gene characterizations through the shape of performance curves. We identify Functional Equivalence Classes (FECs), uniform subsets of annotated and unannotated genes that jointly drive performance, by assessing the presence of straight lines in ROC curves. FECs are widespread across modalities and methods, and can be used to evaluate the extent and context-specificity of functional annotations in a data-driven manner. For example, FECs suggest that B cell markers can be decomposed into shared primary markers (10 to 50 genes) and tissue-specific secondary markers (100 to 500 genes). In addition, FECs are compatible with a wide range of functional encodings, with marker sets spanning at most 5% of the genome and data-driven extensions of Gene Ontology sets spanning up to 40% of the genome. Simple to assess visually and statistically, the identification of FECs in performance curves paves the way for novel functional characterization and increased robustness in analysis.
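The idea of looking for straight lines in a ROC curve can be illustrated with a minimal sketch. This is not the statistical procedure of the paper, which the abstract does not describe; it only measures how far a stretch of ROC points strays from the chord joining its endpoints. The function names `roc_points` and `max_deviation_from_chord`, the toy ranking, and the use of perpendicular distance as the straightness measure are all assumptions made for illustration.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays for a ranking, starting at (0, 0)."""
    order = np.argsort(-np.asarray(scores), kind="stable")  # best score first
    y = np.asarray(labels)[order].astype(bool)
    tpr = np.concatenate([[0.0], np.cumsum(y) / y.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~y) / (~y).sum()])
    return fpr, tpr

def max_deviation_from_chord(fpr, tpr, start, stop):
    """Largest perpendicular distance between the ROC points with indices
    start..stop (inclusive, start < stop) and the straight line joining
    the two endpoints. Values near 0 indicate a nearly straight segment."""
    dx, dy = fpr[stop] - fpr[start], tpr[stop] - tpr[start]
    seg_x = fpr[start:stop + 1] - fpr[start]
    seg_y = tpr[start:stop + 1] - tpr[start]
    return np.max(np.abs(dx * seg_y - dy * seg_x) / np.hypot(dx, dy))

# Hypothetical toy ranking of 100 genes (1 = annotated, 0 = unannotated):
# 10 annotated genes at the top, then 40 genes where annotated and
# unannotated genes are evenly interleaved, then 50 unannotated genes.
labels = np.array([1] * 10 + [1, 0] * 20 + [0] * 50)
scores = np.arange(100, 0, -1)

fpr, tpr = roc_points(scores, labels)
block_dev = max_deviation_from_chord(fpr, tpr, 10, 50)   # interleaved block
whole_dev = max_deviation_from_chord(fpr, tpr, 0, 100)   # entire curve
print(f"interleaved block: {block_dev:.3f}, whole curve: {whole_dev:.3f}")
```

In this toy ranking, the interleaved block yields a deviation close to zero, because annotated and unannotated genes are indistinguishable within it, whereas the curve as a whole deviates strongly from its diagonal chord. A real analysis would need a statistical criterion for straightness rather than a raw distance.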
Item Type: | Paper |
---|---|
Subjects: | organs, tissues, organelles, cell types and functions > cell types and functions > cell types > B cells; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function; bioinformatics > computational biology > algorithms > machine learning |
CSHL Authors: | |
Communities: | CSHL labs > Gillis Lab |
SWORD Depositor: | CSHL Elements |
Depositing User: | CSHL Elements |
Date: | 5 September 2021 |
Date Deposited: | 23 Sep 2021 15:17 |
Last Modified: | 23 Sep 2021 15:17 |
URI: | https://repository.cshl.edu/id/eprint/40361 |