Martin, L., Cook, C., Matasci, N., Williams, J., Bastow, R. (2015) Data mining with iPlant: A meeting report from the 2013 GARNet workshop, Data mining with iPlant. Journal of Experimental Botany, 66 (1). pp. 1-6. ISSN 0022-0957
Abstract
High-throughput sequencing technologies have rapidly moved from large international sequencing centres to individual laboratory benchtops. These changes have driven the 'data deluge' of modern biology. Submissions of nucleotide sequences to GenBank, for example, have doubled in size every year since 1982, and individual data sets now frequently reach terabytes in size. While 'big data' present exciting opportunities for scientific discovery, data analysis skills are not part of the typical wet bench biologist's experience. Knowing what to do with data, how to visualize and analyse them, make predictions, and test hypotheses are important barriers to success. Many researchers also lack adequate capacity to store and share these data, creating further bottlenecks to effective collaboration between groups and institutes. The US National Science Foundation-funded iPlant Collaborative was established in 2008 to form part of the data collection and analysis pipeline and help alleviate the bottlenecks associated with the big data challenge in plant science. Leveraging the power of high-performance computing facilities, iPlant provides free-to-use cyberinfrastructure to enable terabytes of data storage, improve analysis, and facilitate collaborations. To help train UK plant science researchers to use the iPlant platform and understand how it can be exploited to further research, GARNet organized a four-day Data mining with iPlant workshop at Warwick University in September 2013. This report provides an overview of the workshop, and highlights the power of the iPlant environment for lowering barriers to using complex bioinformatics resources, furthering discoveries in plant science research and providing a platform for education and outreach programmes.
Item Type: | Paper |
---|---|
Subjects: | bioinformatics > genomics and proteomics > databases > database optimization; bioinformatics > genomics and proteomics > databases > database search and retrieval; bioinformatics > genomics and proteomics > datasets; Investigative techniques and equipment > assays > next generation sequencing |
Communities: | Dolan DNA Learning Center |
Depositing User: | Matt Covey |
Date: | 2015 |
Date Deposited: | 24 Oct 2014 16:55 |
Last Modified: | 24 Apr 2015 19:11 |
URI: | https://repository.cshl.edu/id/eprint/30868 |