Mapping and characterization of structural variation in 17,795 human genomes.

Abel, Haley J, Larson, David E, Regier, Allison A, Chiang, Colby, Das, Indraniel, Kanchi, Krishna L, Layer, Ryan M, Neale, Benjamin M, Salerno, William J, Reeves, Catherine, Buyske, Steven, NHGRI Centers for Common Disease Genomics, Matise, Tara C, Muzny, Donna M, Zody, Michael C, Lander, Eric S, Dutcher, Susan K, Stitziel, Nathan O, Hall, Ira M (July 2020) Mapping and characterization of structural variation in 17,795 human genomes. Nature, 583 (7814). pp. 83-89. ISSN 0028-0836

Abstract

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline (ref. 1) to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
organism description > animal
Investigative techniques and equipment > assays
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > epigenetics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > epigenetics
organism description > animal > mammal
organism description > animal > mammal > rodent > mouse
organism description > animal > mammal > rodent
Investigative techniques and equipment > assays > whole genome sequencing
Communities: CSHL labs > Iossifov lab
CSHL labs > Wigler lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: July 2020
Date Deposited: 03 May 2021 15:16
Last Modified: 26 Jan 2024 17:07
PMCID: PMC7547914
URI: https://repository.cshl.edu/id/eprint/39970
