Samovar: Single-Sample Mosaic Single-Nucleotide Variant Calling with Linked Reads

Darby, C. A., Fitch, J. R., Brennan, P. J., Kelly, B. J., Bir, N., Magrini, V., Leonard, J., Cottrell, C. E., Gastier-Foster, J. M., Wilson, R. K., Mardis, E. R., White, P., Langmead, B., Schatz, M. C. (August 2019) Samovar: Single-Sample Mosaic Single-Nucleotide Variant Calling with Linked Reads. iScience, 18 (Specia). pp. 1-10. ISSN 2589-0042 (Public Dataset)

1-s2.0-S2589004219301749-main.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

URL: https://www.ncbi.nlm.nih.gov/pubmed/31271967
DOI: 10.1016/j.isci.2019.05.037

Abstract

Linked-read sequencing greatly improves haplotype assembly over standard paired-end analysis. The detection of mosaic single-nucleotide variants benefits from haplotype assembly when the model is informed by the mapping between constituent reads and linked reads. Samovar evaluates haplotype-discordant reads identified through linked-read sequencing, thus enabling phasing and mosaic variant detection across the entire genome. Samovar trains a random forest model to score candidate sites using a dataset that considers read quality, phasing, and linked-read characteristics. Samovar calls mosaic single-nucleotide variants (SNVs) within a single sample with accuracy comparable to what previously required trios or matched tumor/normal pairs, and it outperforms single-sample mosaic variant callers at minor allele frequencies of 5%-50% with at least 30X coverage. Samovar finds somatic variants in both tumor and normal whole-genome sequencing from 13 pediatric cancer cases that can be corroborated with high recall by whole-exome sequencing. Samovar is available open-source at https://github.com/cdarby/samovar under the MIT license.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > annotation > variant calling
CSHL Authors:
Communities: CSHL labs > Schatz lab
Depositing User: Matthew Dunn
Date: 30 August 2019
Date Deposited: 29 Jul 2019 15:14
Last Modified: 29 Jun 2021 19:11
PMCID: PMC6609817
Related URLs:
Dataset ID:
  • https://github.com/cdarby/samovar
  • http://labshare.cshl.edu/shares/schatzlab/www-data/samovar/simulation/
URI: https://repository.cshl.edu/id/eprint/38144
