Comprehensive genome and transcriptome structural analysis of a breast cancer cell line using single molecule sequencing

Nattestad, Maria, Ng, Karen, Goodwin, Sara, Baslan, Timour, Sedlazeck, Fritz, Gurtowski, James, Hutton, Elizabeth, Sundaravadanam, Yogi, Garvin, Tyler, Alford, Marley, Tseng, Elizabeth, Rescheneder, Philipp, Chin, Jason, Beck, Timothy, Kramer, Melissa, McPherson, John, Hicks, James, Schatz, Michael C, McCombie, William R (2016) Comprehensive genome and transcriptome structural analysis of a breast cancer cell line using single molecule sequencing. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research, 2016 Apr 16-20, New Orleans, LA.

Abstract

Genomic instability is one of the hallmarks of cancer, leading to widespread copy number variations, chromosomal fusions, and other structural variations in many cancers. The breast cancer cell line SK-BR-3 is an important model for HER2+ breast cancers, which are among the most aggressive forms of the disease and account for about one in five cases. Through short read sequencing, copy number arrays, and other technologies, the genome of SK-BR-3 is known to be highly rearranged with many copy number variations, including an approximately twenty-fold amplification of the HER2 oncogene, along with numerous other amplifications and deletions. However, these technologies cannot precisely characterize the nature and context of the identified genomic events, and other important mutations may be missed altogether because of repeats, multi-mapping reads, and the failure to reliably anchor alignments to both sides of a variation.

To address these challenges, we have sequenced SK-BR-3 using PacBio long read technology. Using the new P6-C4 chemistry, we generated more than 70X coverage of the genome with average read lengths of 9-13 kb (max: 71 kb). Using Lumpy for split-read alignment analysis, as well as our novel assembly-based algorithms for finding complex variants, we have developed a detailed map of structural variations in this cell line. Taking advantage of the newly identified breakpoints and combining these with copy number assignments, we have developed an algorithm to reconstruct the mutational history of this cancer genome. From this we have characterized the amplifications of the HER2 region, discovering a complex series of nested duplications and translocations between chr17 and chr8, two of the most frequent translocation partners in primary breast cancers. We have also carried out full-length transcriptome sequencing using PacBio's Iso-Seq technology, which has revealed a number of previously unrecognized gene fusions and isoforms.

Combining long-read genome and transcriptome sequencing technologies enables an in-depth analysis of how changes in the genome affect the transcriptome, including how gene fusions are created across multiple chromosomes. This analysis has established the most complete cancer reference genome available to date, and is already opening the door to applying long-read sequencing to patient samples with complex genome structures.

Item Type: Conference or Workshop Item (Speech)
Subjects: bioinformatics
diseases & disorders > cancer
diseases & disorders
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
diseases & disorders > cancer > cancer types > breast cancer
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > transcriptomes
diseases & disorders > cancer > cancer types
Communities: CSHL labs > Krasnitz lab
CSHL labs > McCombie lab
CSHL labs > Schatz lab
CSHL labs > Siepel lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 2016
Date Deposited: 22 Jan 2024 16:49
Last Modified: 22 Jan 2024 16:49
URI: https://repository.cshl.edu/id/eprint/41406
