Accurate inference of genetic ancestry from cancer-derived data

Belleau, Pascal, Deschenes, Astrid, Tuveson, David A, Krasnitz, Alexander (2022) Accurate inference of genetic ancestry from cancer-derived data. In: American Association for Cancer Research Annual Meeting 2022, 2022 Apr 8-13, Philadelphia, PA.

Abstract

Introduction: For multiple cancer types, epidemiological data show strong correlations between the ancestral background of the patient and the incidence of the disease, its severity at diagnosis, and its clinical outcome. Recent studies point to genetic and phenotypic differences between tumors in patient populations with differing genetic ancestries, and to the need for more data collection to power research in this area. We sought to facilitate such analysis by developing computational tools for genetic ancestry inference from cancer-derived molecular data, without the need for the patient’s cancer-free genotype or self-declared race or ethnicity. The ability to perform such inference accurately would unlock vast amounts of cancer-derived data from two major sources for ancestry-oriented studies of cancer. One is the body of data stored in massive digital repositories. The other is the body of archival tumor tissues from which molecular data may be generated.

Methods: We developed methods for genetic ancestry inference from cancer-derived whole exomes, transcriptomes and targeted gene panels, in the absence of matching cancer-free genomic data. These methods are adaptive: they optimize their performance for each input cancer-derived molecular profile. As a result, they perform consistently, and with quantifiable accuracy, across a range of profiling depths and qualities, and they mitigate the effects of cancer-related damage to the genome, such as somatic copy-number variation.

Results: We examined the performance of these tools with molecular data from three cancer types: pancreatic and ovarian cancers as representative of epithelial tumor types, and acute myeloid leukemia as an example of hematopoietic malignancy. Three molecular data types were considered: whole-exome sequences, exome sequences targeting a panel of cancer-related genes, and RNA sequences. The inference accuracy was found to be consistently above 97% across the three cancers and the three data types.

Conclusion: Our study demonstrates the feasibility of accurate inference of genetic ancestry from cancer-derived data, with no need for matching cancer-free genotypes. Computational tools for this purpose will be made available as open-source software.

Item Type: Conference or Workshop Item (Poster)
Subjects: diseases & disorders > cancer
Communities: CSHL labs > Krasnitz lab
CSHL labs > Tuveson lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 2022
Date Deposited: 28 Sep 2023 19:17
Last Modified: 28 Sep 2023 19:17
URI: https://repository.cshl.edu/id/eprint/41045
