Variant phasing and haplotypic expression from long-read sequencing in maize

Wang, B., Tseng, E., Baybayan, P., Eng, K., Regulski, M., Jiao, Y., Wang, L., Olson, A., Chougule, K., Van Buren, P., Ware, D. (February 2020) Variant phasing and haplotypic expression from long-read sequencing in maize. Commun Biol, 3 (1). p. 78. ISSN 2399-3642 (Public Dataset)

s42003-020-0805-8.pdf - Published Version

URL: https://www.ncbi.nlm.nih.gov/pubmed/32071408
DOI: 10.1038/s42003-020-0805-8

Abstract

Haplotype phasing of maize genetic variants is important for genome interpretation, population genetic analysis and functional analysis of allelic activity. We performed an isoform-level phasing study using two maize inbred lines and their reciprocal crosses, based on single-molecule, full-length cDNA sequencing. To phase and analyze transcripts between hybrids and parents, we developed IsoPhase. Using this tool, we validated the majority of SNPs called against matching short-read data from embryo, endosperm and root tissues, and identified allele-specific, gene-level and isoform-level differential expression between the inbred parental lines and hybrid offspring. After phasing 6907 genes in the reciprocal hybrids, we annotated the SNPs and identified large-effect genes. In addition, we identified parent-of-origin isoforms, distinct novel isoforms in maize parent and hybrid lines, and imprinted genes from different tissues. Finally, we characterized variation in cis- and trans-regulatory effects. Our study provides measures of haplotypic expression that could increase accuracy in studies of allelic expression.

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
Investigative techniques and equipment
organism description > plant > maize
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function > alleles
Investigative techniques and equipment > assays
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function > gene expression
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > genes, structure and function
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > single nucleotide polymorphism > haplotype
Investigative techniques and equipment > assays > long-read sequencing
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > mRNA
organism description > plant
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > single nucleotide polymorphism
Communities: CSHL labs > Ware lab
Depositing User: Adrian Gomez
Date: 18 February 2020
Date Deposited: 21 Feb 2020 15:40
Last Modified: 01 Feb 2024 19:14
PMCID: PMC7028979
Dataset ID:
  • The IsoPhase tool developed in this study is available in the GitHub repository https://github.com/magdoll/cdna_cupcake. The related files generated from the code are stored in the Zenodo repository (https://doi.org/10.5281/zenodo.2611319).
  • The data generated in this study, including PacBio Iso-Seq reads and Illumina short reads, have been submitted to ArrayExpress (https://www.ebi.ac.uk/arrayexpress/) under accession numbers E-MTAB-7837 and E-MTAB-7394.
URI: https://repository.cshl.edu/id/eprint/39069
