SplitMEM: A graphical algorithm for pan-genome analysis with suffix skips

Marcus, S., Lee, H., Schatz, M. C. (November 2014) SplitMEM: A graphical algorithm for pan-genome analysis with suffix skips. Bioinformatics, 30 (24). pp. 3476-3483. ISSN 1367-4803

URL: http://www.ncbi.nlm.nih.gov/pubmed/25398610
DOI: 10.1093/bioinformatics/btu756

Abstract

MOTIVATION: Genomics is expanding from a single reference per species paradigm into a more comprehensive pan-genome approach that analyzes multiple individuals together. A compressed de Bruijn graph is a sophisticated data structure for representing the genomes of entire populations. It robustly encodes shared segments, simple SNPs, and complex structural variations far beyond what can be represented in a collection of linear sequences alone.

RESULTS: We explore deep topological relationships between suffix trees and compressed de Bruijn graphs and introduce an algorithm, splitMEM, that directly constructs the compressed de Bruijn graph in time and space linear to the total number of genomes for a given maximum genome size. We introduce suffix skips to traverse several suffix links simultaneously, and use them to efficiently decompose maximal exact matches (MEMs) into graph nodes. We demonstrate the utility of splitMEM by analyzing the 9-strain pan-genome of Bacillus anthracis and up to 62 strains of Escherichia coli, revealing their core-genome properties.

AVAILABILITY: Source code and documentation are available open-source at http://splitmem.sourceforge.net

CONTACT: mschatz@cshl.edu
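
To make the notion of a compressed de Bruijn graph concrete, the following is a minimal illustrative sketch in Python. It is not the splitMEM algorithm: it uses neither suffix trees, MEMs, nor suffix skips, and it builds the graph naively by enumerating k-mers. The function names (build_de_bruijn, compress) and the choice to draw edges only between k-mers that are adjacent in some input sequence are assumptions made for this example. Each returned label is one compressed node, obtained by merging a non-branching chain of k-mers.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Naive de Bruijn graph: nodes are k-mers, and an edge joins two
    k-mers that occur consecutively in at least one input sequence
    (an assumption of this sketch)."""
    nodes = set()
    succ = defaultdict(set)
    pred = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            nodes.add(kmer)
            if i + k < len(seq):
                nxt = seq[i + 1:i + k + 1]
                succ[kmer].add(nxt)
                pred[nxt].add(kmer)
    return nodes, succ, pred


def compress(nodes, succ, pred):
    """Merge each non-branching chain of k-mers into one node label."""

    def starts_chain(v):
        # v opens a chain unless it has exactly one predecessor and
        # that predecessor has exactly one successor.
        ps = pred.get(v, set())
        if len(ps) != 1:
            return True
        (p,) = ps
        return len(succ.get(p, set())) != 1

    def walk(v, seen):
        # Extend the label one character at a time along the chain.
        label, cur = v, v
        seen.add(v)
        while len(succ.get(cur, set())) == 1:
            (nxt,) = succ[cur]
            if len(pred.get(nxt, set())) != 1 or nxt in seen:
                break
            label += nxt[-1]
            seen.add(nxt)
            cur = nxt
        return label

    seen = set()
    labels = []
    for v in sorted(nodes):
        if starts_chain(v):
            labels.append(walk(v, seen))
    # Isolated cycles have no chain start; pick up any leftovers.
    for v in sorted(nodes):
        if v not in seen:
            labels.append(walk(v, seen))
    return sorted(labels)


if __name__ == "__main__":
    # Two toy "genomes" differing by a single SNP (T vs C at position 4).
    nodes, succ, pred = build_de_bruijn(["GATTACA", "GATCACA"], 3)
    print(compress(nodes, succ, pred))
    # prints ['ACA', 'ATCAC', 'ATTAC', 'GAT']
```

In this toy case the shared flanks (GAT and ACA) become single nodes and the SNP appears as two alternative nodes between them, which is the kind of shared-segment and variant structure the abstract describes. The naive enumeration here scales with total sequence length times k and is meant only to show the data structure, not the linear-time construction reported in the paper.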

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > computers
bioinformatics > computational biology
bioinformatics > genomics and proteomics > computers > computer software
Communities: CSHL labs > Schatz lab
CSHL Cancer Center Program > Cancer Genetics
Depositing User: Matt Covey
Date: 13 November 2014
Date Deposited: 21 Nov 2014 16:18
Last Modified: 14 Oct 2015 19:16
PMCID: PMC4253837
URI: http://repository.cshl.edu/id/eprint/30922
