Deng, Cecilia H, Naithani, Sushma, Kumari, Sunita, Cobo-Simón, Irene, Quezada-Rodríguez, Elsa H, Skrabisova, Maria, Gladman, Nick, Correll, Melanie J, Sikiru, Akeem Babatunde, Afuwape, Olusola O, Marrano, Annarita, Rebollo, Ines, Zhang, Wentao, Jung, Sook (December 2023) Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences. Database: the journal of biological databases and curation, 2023. ISSN 1758-0463
2023_Deng_Genotype_and_phenotype_data_standardization.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
Large-scale genotype and phenotype data have been increasingly generated to identify genetic markers, understand gene function and evolution and facilitate genomic selection. These datasets hold immense value for both current and future studies, as they are vital for crop breeding, yield improvement and overall agricultural sustainability. However, integrating these datasets from heterogeneous sources presents significant challenges and hinders their effective utilization. We established the Genotype-Phenotype Working Group in November 2021 as a part of the AgBioData Consortium (https://www.agbiodata.org) to review current data types and resources that support archiving, analysis and visualization of genotype and phenotype data to understand the needs and challenges of the plant genomic research community. During 2021-22, we identified different types of datasets and examined metadata annotations related to experimental design/methods/sample collection, etc. Furthermore, we thoroughly reviewed publicly funded repositories for raw and processed data as well as secondary databases and knowledgebases that enable the integration of heterogeneous data in the context of the genome browser, pathway networks and tissue-specific gene expression. Based on our survey, we identify a need for (i) additional infrastructural support for archiving many new data types, (ii) development of community standards for data annotation and formatting, (iii) resources for biocuration and (iv) analysis and visualization tools to connect genotype data with phenotype data to enhance knowledge synthesis and to foster translational research. Although this paper only covers the data and resources relevant to the plant research community, we expect that similar issues and needs are shared by researchers working on animals. Database URL: https://www.agbiodata.org.
Item Type: | Paper |
---|---|
Subjects: | bioinformatics; bioinformatics > genomics and proteomics; organism description > plant |
Communities: | CSHL labs > Ware lab |
SWORD Depositor: | CSHL Elements |
Depositing User: | CSHL Elements |
Date: | 11 December 2023 |
Date Deposited: | 20 Dec 2023 19:01 |
Last Modified: | 08 Jan 2024 19:02 |
PMCID: | PMC10712715 |
URI: | https://repository.cshl.edu/id/eprint/41336 |