Concerning the eXclusion in human genomics: the choice of sex chromosome representation in the human genome drastically affects the number of identified variants

Pinto, Brendan J, O'Connor, Brian, Schatz, Michael C, Zarate, Samantha, Wilson, Melissa A (September 2023) Concerning the eXclusion in human genomics: the choice of sex chromosome representation in the human genome drastically affects the number of identified variants. G3: Genes, Genomes, Genetics, 13 (10). jkad169. ISSN 2160-1836

Concerning the eXclusion in human genomics the choice of sex chromosome representation in the human genome drastically affec.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract

Over the past 30 years, a community of scientists has pieced together every base pair of the human reference genome from telomere to telomere. Interestingly, most human genomics studies omit more than 5% of the genome from their analyses. Under "normal" circumstances, omitting any chromosome(s) from an analysis of the human genome would be a cause for concern, with the exception being sex chromosomes. Sex chromosomes in eutherians share an evolutionary origin as an ancestral pair of autosomes. In humans, they share 3 regions of high-sequence identity (∼98-100%), which, along with the unique transmission patterns of the sex chromosomes, introduce technical artifacts in genomic analyses. However, the human X chromosome bears numerous important genes, including more "immune response" genes than any other chromosome, which makes its exclusion irresponsible when sex differences across human diseases are widespread. To better characterize the possible effect of the inclusion/exclusion of the X chromosome on variants called, we conducted a pilot study on the Terra cloud platform to replicate a subset of standard genomic practices using both the CHM13 reference genome and the sex chromosome complement-aware reference genome. We compared the quality of variant calling, expression quantification, and allele-specific expression using these 2 reference genome versions across 50 human samples from the Genotype-Tissue Expression consortium annotated as females. We found that after correction, the whole X chromosome (100%) can generate reliable variant calls, allowing for the inclusion of the whole genome in human genomics analyses as a departure from the status quo of omitting the sex chromosomes from empirical and clinical genomics studies.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > chromosome
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > chromosomes, structure and function > chromosome
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > chromosomes, structure and function
organism description > animal > mammal > primates > hominids > human
bioinformatics > genomics and proteomics > annotation > variant calling
Communities: CSHL labs > Schatz lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 30 September 2023
Date Deposited: 11 Oct 2023 17:27
Last Modified: 10 Jan 2024 20:55
PMCID: PMC10542555
URI: https://repository.cshl.edu/id/eprint/41197
