Genotyping in the cloud with Crossbow

Gurtowski, J., Schatz, M. C., Langmead, B. (September 2012) Genotyping in the cloud with Crossbow. Current Protocols in Bioinformatics. 15.3. ISSN 1934-340X (Electronic), 1934-3396 (Linking)

Abstract

Crossbow is a scalable, portable, and automatic cloud computing tool for identifying SNPs from high-coverage, short-read resequencing data. It is built on Apache Hadoop, an implementation of the MapReduce software framework. Hadoop allows Crossbow to distribute read alignment and SNP calling subtasks over a cluster of commodity computers. Two robust tools, Bowtie and SOAPsnp, implement the fundamental alignment and variant calling operations respectively, and have demonstrated capabilities within Crossbow of analyzing approximately one billion short reads per hour on a commodity Hadoop cluster with 320 cores. Through protocol examples, this unit will demonstrate the use of Crossbow for identifying variations in three different operating modes: on a Hadoop cluster, on a single computer, and on the Amazon Elastic MapReduce cloud computing service. Curr. Protoc. Bioinform. 39:15.3.1-15.3.15. (c) 2012 by John Wiley & Sons, Inc.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > SNP
bioinformatics > genomics and proteomics > computers > computer software
Communities: CSHL labs > Schatz lab
Depositing User: Matt Covey
Date: September 2012
Date Deposited: 31 Jan 2013 17:23
Last Modified: 31 Jan 2013 17:23
PMCID: PMC3465669
URI: https://repository.cshl.edu/id/eprint/26953
