Benchmarking challenging small variants with linked and long reads

Wagner, J, Olson, ND, Harris, L, Khan, Z, Farek, J, Mahmoud, M, Stankovic, A, Kovacevic, V, Yoo, B, Miller, N, Rosenfeld, JA, Ni, B, Zarate, S, Kirsche, M, Aganezov, S, Schatz, MC, Narzisi, G, Byrska-Bishop, M, Clarke, W, Evani, US, Markello, C, Shafin, K, Zhou, X, Sidow, A, Bansal, V, Ebert, P, Marschall, T, Lansdorp, P, Hanlon, V, Mattsson, CA, Barrio, AM, Fiddes, IT, Xiao, C, Fungtammasan, A, Chin, CS, Wenger, AM, Rowell, WJ, Sedlazeck, FJ, Carroll, A, Salit, M, Zook, JM (May 2022) Benchmarking challenging small variants with linked and long reads. Cell Genomics, 2 (5). p. 100128. ISSN 2666-979X

2022_Wagner_Benchmarking_challenging_small_variants_with_linked.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in seven samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, the new benchmark includes 92% of the autosomal GRCh38 assembly, compared with 85% in the previous version, while excluding regions that are problematic for benchmarking small variants, such as copy number variants, and that should not have been included in the previous version. The new benchmark identifies eight times more false negatives in a short-read variant call set than our previous benchmark did. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.

Item Type: Paper
Subjects: Investigative techniques and equipment
Investigative techniques and equipment > assays
Investigative techniques and equipment > assays > long-read sequencing
Communities: CSHL labs > Schatz lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 11 May 2022
Date Deposited: 29 Sep 2023 18:08
Last Modified: 17 Jan 2024 15:55
PMCID: PMC9706577
URI: https://repository.cshl.edu/id/eprint/41072
