Accurate and robust inference of genetic ancestry from cancer-derived molecular data across genomic platforms

Belleau, Pascal, Deschênes, Astrid, Tuveson, David, Krasnitz, Alexander (February 2022) Accurate and robust inference of genetic ancestry from cancer-derived molecular data across genomic platforms. BioRxiv. (Unpublished)

2022.Belleau.genetic_ancestry.pdf
Available under License Creative Commons Attribution Non-commercial No Derivatives.

URL: https://www.biorxiv.org/content/10.1101/2022.02.01...
DOI: 10.1101/2022.02.01.478737

Abstract

Genetic ancestry-oriented cancer research requires accurate and robust ancestry inference from existing cancer-derived data, including whole exomes, transcriptomes and targeted gene panels, very often in the absence of matching cancer-free genomic data. To optimize and assess the performance of ancestry inference for any given input cancer-derived molecular profile, we develop a data synthesis framework. In its core procedure, the ancestral background of the profiled patient is replaced with that of one of any number of individuals with known ancestry. Data synthesis is applicable to multiple profiling platforms and makes it possible to assess the performance of inference separately for each continental-level ancestry. This ability extends to all ancestries, including those without statistically sufficient representation in the existing cancer data. We further show that our inference procedure is accurate and robust across a wide range of sequencing depths. Testing our approach on three representative cancer types and across three molecular profiling modalities, we demonstrate that the global, continental-level ancestry of the patient can be inferred with high accuracy, as quantified by its agreement with the gold standard of ancestry derived from matching cancer-free molecular data. Our study demonstrates that vast amounts of existing cancer-derived molecular data are potentially amenable to ancestry-oriented studies of the disease, without recourse to matching cancer-free genomes or patients’ self-identification by ancestry.
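
The per-ancestry assessment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `per_ancestry_accuracy`, the ancestry labels and the example calls are all hypothetical, chosen only to show how accuracy might be tallied separately for each continental-level ancestry when the true ancestry of every synthetic profile is known.

```python
from collections import defaultdict


def per_ancestry_accuracy(known, inferred):
    """Return {ancestry: fraction of profiles whose inferred label matches}.

    `known` holds the ancestry of the individual whose background was
    substituted into each synthetic profile; `inferred` holds the label
    returned by the inference procedure for that profile.
    """
    if len(known) != len(inferred):
        raise ValueError("known and inferred must have the same length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, call in zip(known, inferred):
        total[truth] += 1
        if truth == call:
            correct[truth] += 1
    return {ancestry: correct[ancestry] / total[ancestry] for ancestry in total}


# Hypothetical labels for six synthetic profiles (illustrative only,
# not data from the study).
known = ["AFR", "AFR", "EUR", "EUR", "EAS", "EAS"]
inferred = ["AFR", "EUR", "EUR", "EUR", "EAS", "EAS"]
accuracy = per_ancestry_accuracy(known, inferred)
```

Because the known labels come from the substituted individuals rather than from the cancer cohort itself, a tally of this kind can in principle cover ancestries that are rare in the available cancer data.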

Item Type: Paper
Subjects: diseases & disorders > cancer
bioinformatics > genomics and proteomics
CSHL Authors:
Communities: CSHL labs > Krasnitz lab
CSHL labs > Tuveson lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 4 February 2022
Date Deposited: 07 Apr 2022 14:37
Last Modified: 07 Apr 2022 14:37
URI: https://repository.cshl.edu/id/eprint/40573
