The fractured landscape of RNA-seq alignment: the default in our STARs

Ballouz, S., Dobin, A., Gingeras, T. R., Gillis, J. (May 2018) The fractured landscape of RNA-seq alignment: the default in our STARs. Nucleic Acids Res. ISSN 0305-1048

2017.Ballouz.lncRNA.GWAS.pdf - Published Version
Available under License Creative Commons Attribution.

URL: https://www.ncbi.nlm.nih.gov/pubmed/29718481
DOI: 10.1093/nar/gky325

Abstract

Many tools are available for RNA-seq alignment and expression quantification, with comparative value being hard to establish. Benchmarking assessments often highlight methods' good performance, but are focused on either model data or fail to explain variation in performance. This leaves us to ask, what is the most meaningful way to assess different alignment choices? And importantly, where is there room for progress? In this work, we explore the answers to these two questions by performing an exhaustive assessment of the STAR aligner. We assess STAR's performance across a range of alignment parameters using common metrics, and then on biologically focused tasks. We find technical metrics such as fraction mapping or expression profile correlation to be uninformative, capturing properties unlikely to have any role in biological discovery. Surprisingly, we find that changes in alignment parameters within a wide range have little impact on both technical and biological performance. Yet, when performance finally does break, it happens in difficult regions, such as X-Y paralogs and MHC genes. We believe improved reporting by developers will help establish where results are likely to be robust or fragile, providing a better baseline to establish where methodological progress can still occur.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > alignment
bioinformatics
bioinformatics > genomics and proteomics
Investigative techniques and equipment
bioinformatics > genomics and proteomics > alignment > sequence alignment
bioinformatics > computational biology > algorithms
Investigative techniques and equipment > assays
bioinformatics > computational biology
Investigative techniques and equipment > assays > RNA-seq
Communities: CSHL labs > Gillis Lab
CSHL labs > Gingeras lab
CSHL labs > Dobin Lab
CSHL Cancer Center Program > Cancer Genetics and Genomics Program
CSHL Cancer Center Program > Gene Regulation and Inheritance Program
Depositing User: Matt Covey
Date: 1 May 2018
Date Deposited: 22 May 2018 15:01
Last Modified: 06 Feb 2024 21:01
PMCID: PMC6007662
URI: https://repository.cshl.edu/id/eprint/36572
