Learning single-cell chromatin accessibility profiles using meta-analytic marker genes

Karakida Kawaguchi, Risa, Tang, Ziqi, Fischer, Stephan, Rajesh, Chandana, Tripathy, Rohit, Koo, Peter K, Gillis, Jesse (December 2022) Learning single-cell chromatin accessibility profiles using meta-analytic marker genes. Briefings in Bioinformatics. bbac541. ISSN 1467-5463

Available under License Creative Commons Attribution Non-commercial.

URL: https://www.ncbi.nlm.nih.gov/pubmed/36549922
DOI: 10.1093/bib/bbac541

Abstract

MOTIVATION: Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is a valuable resource for learning cis-regulatory elements such as cell-type-specific enhancers and transcription factor binding sites. However, cell-type identification in scATAC-seq data is known to be challenging because of the heterogeneity introduced by different protocols and the high dropout rate.

RESULTS: In this study, we perform a systematic comparison of seven scATAC-seq datasets of mouse brain to benchmark the efficacy of neuronal cell-type annotation from gene sets. We find that redundant marker genes dramatically improve the annotation of sparse scATAC-seq data across datasets collected in different studies. Interestingly, simple aggregation of such marker genes achieves performance comparable to or higher than that of machine-learning classifiers, suggesting its potential for downstream applications. Based on our results, we reannotated all scATAC-seq data for detailed cell types using robust marker genes. Their meta scATAC-seq profiles are publicly available at https://gillisweb.cshl.edu/Meta_scATAC. Furthermore, we trained a deep neural network to predict chromatin accessibility from DNA sequence alone and identified key motifs enriched for each neuronal subtype. These predicted profiles are visualized together in our database as a valuable resource for exploring cell-type-specific epigenetic regulation in a sequence-dependent and -independent manner.
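
The marker-gene aggregation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a mean as the aggregate, the toy gene-activity matrix, and the two small marker sets are all assumptions chosen for illustration. Each cell is scored by averaging its gene-level accessibility over a marker set and is assigned the cell type with the highest score.

```python
from typing import Dict, List, Sequence


def aggregate_marker_scores(
    gene_activity: Sequence[Sequence[float]],
    gene_names: Sequence[str],
    marker_sets: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """Average gene activity over each marker set, per cell.

    gene_activity is a cells x genes matrix (hypothetical input format).
    Markers missing from gene_names are skipped.
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    scores: Dict[str, List[float]] = {}
    for cell_type, markers in marker_sets.items():
        cols = [index[g] for g in markers if g in index]
        denom = max(len(cols), 1)  # avoid division by zero for empty sets
        scores[cell_type] = [
            sum(row[c] for c in cols) / denom for row in gene_activity
        ]
    return scores


def annotate_cells(scores: Dict[str, List[float]]) -> List[str]:
    """Assign each cell the type with the highest aggregated score.

    Ties go to the cell type listed first in the dictionary.
    """
    cell_types = list(scores)
    n_cells = len(scores[cell_types[0]])
    return [
        max(cell_types, key=lambda ct: scores[ct][i]) for i in range(n_cells)
    ]


# Toy example (illustrative values, not data from the study).
gene_names = ["Gad1", "Gad2", "Slc17a7", "Neurod6"]
marker_sets = {
    "inhibitory": ["Gad1", "Gad2"],
    "excitatory": ["Slc17a7", "Neurod6"],
}
gene_activity = [
    [1, 1, 0, 0],  # both inhibitory markers accessible
    [0, 0, 1, 1],  # both excitatory markers accessible
    [0, 1, 0, 0],  # Gad1 signal missing (dropout); Gad2 still detected
]

scores = aggregate_marker_scores(gene_activity, gene_names, marker_sets)
labels = annotate_cells(scores)
print(labels)
```

In the third toy cell, one marker has no signal, yet the aggregate over the redundant set still favours the same cell type, which is the intuition behind using several markers per type on sparse data.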

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
organism description > animal
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > cis-regulatory elements
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > epigenetics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > epigenetics
organism description > animal > mammal
organism description > animal > mammal > rodent > mouse
organism description > animal > mammal > rodent
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > transcription factor
Communities: CSHL labs > Gillis Lab
CSHL labs > Koo Lab
School of Biological Sciences > Publications
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 22 December 2022
Date Deposited: 04 Jan 2023 17:45
Last Modified: 29 Feb 2024 18:12
PMCID: PMC9851328
URI: https://repository.cshl.edu/id/eprint/40781
