High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing

Lagarde, J., Uszczynska-Ratajczak, B., Carbonell, S., Perez-Lluch, S., Abad, A., Davis, C., Gingeras, T. R., Frankish, A., Harrow, J., Guigo, R., Johnson, R. (December 2017) High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing. Nat Genet, 49 (12). pp. 1731-1740. ISSN 1061-4036

URL: https://www.ncbi.nlm.nih.gov/pubmed/29106417
DOI: 10.1038/ng.3988

Abstract

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

Item Type: Paper
Subjects: bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > long non-coding RNA
Investigative techniques and equipment > assays > long-read sequencing
CSHL Authors:
Communities: CSHL Cancer Center Program > Gene Regulation and Cell Proliferation
CSHL labs > Gingeras lab
CSHL Cancer Center Program > Cancer Genetics and Genomics Program
Depositing User: Matt Covey
Date: December 2017
Date Deposited: 15 Nov 2017 16:30
Last Modified: 05 Nov 2020 21:18
PMCID: PMC5709232
Related URLs:
URI: https://repository.cshl.edu/id/eprint/35666
