Gillis, J., Pavlidis, P. (2013) Data from "Assessing identity, redundancy and confounds in Gene Ontology annotations over time". [Dataset]
Plain Text (HIPPIE PPIN - The protein interaction data used in sections 3.3 and 3.4.)
HIPPIE_used_current_Jan30_2012.txt - Published Version Available under License Creative Commons Attribution. Download (1MB)
Plain Text (frac_confound_GO_103.txt - Each GO group's confoundedness for our final data point for GO. These data are plotted in Figure 3A. "NaN" occurs where there was division by zero.)
frac_confound_GO_103.txt - Published Version Available under License Creative Commons Attribution. Download (59kB)
Plain Text (frac_confound_con_103.txt - Number of functions shared by gene pairs from the PPIN, and the number of functions confounded for our final data point for GO (edition 103). These data are plotted in Figure 3B.)
frac_confound_con_103.txt - Published Version Available under License Creative Commons Attribution. Download (1MB)
Plain Text (frac_confound_con_aved.txt - The connection-level data plotted in Figure 5A.)
frac_confound_con_aved.txt - Published Version Available under License Creative Commons Attribution. Download (501B)
Plain Text (frac_confound_GO_aved.txt - The GO-term-level data plotted in Figure 5A.)
frac_confound_GO_aved.txt - Published Version Available under License Creative Commons Attribution. Download (500B)
Plain Text (Confound table - List of GO IDs and the PubMed IDs of the papers contributing the most confound edges for those functions.)
confound_table.txt - Published Version Available under License Creative Commons Attribution. Download (37kB)
Plain Text (Semantic stability table - List of genes and the number of GO editions since they changed their functional identity, measured as the highest semantic similarity with itself.)
semantic_stability.txt - Published Version Available under License Creative Commons Attribution. Download (70kB)
Plain Text (Semantic similarity table - Similarity ranking for each gene back through each edition of GO. A value of "1" means the gene was "most similar to itself" or tied for first.)
semantic_similarity.txt - Published Version Available under License Creative Commons Attribution. Download (2MB)
Plain Text (Multifunctionality rankings table - List of gene multifunctionality rankings over time. Useful for anyone interested in reducing annotation bias in GO.)
multifunctionality_rankings.txt - Published Version Download (2MB)
Abstract
The Gene Ontology (GO) is heavily used in systems biology, but the potential for redundancy, confounds with other data sources, and problems with stability over time have been little explored. We report that GO annotations are stable over short periods, with 3% of genes not being most semantically similar to themselves between monthly GO editions. However, we find that genes can alter their "functional identity" over time, with 20% of genes not matching to themselves (by semantic similarity) after two years. We further find that annotation bias in GO, in which some genes are more characterized than others, has declined in yeast but generally risen in humans. Finally, we discovered that many entries in protein interaction databases are due to the same published reports that are used for GO annotations, with 66% of assessed GO groups exhibiting this confound. We provide a case study to illustrate how this information can be used in analyses of gene sets and networks. The following files for human genes are intended to assist researchers who wish to check their own data for the types of effects we report in the paper. The files are tab-delimited. Genes are referenced by NCBI IDs or official symbols, and publications by PubMed IDs.
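Because the files are tab-delimited and at least one of them contains "NaN" entries, a reader could load them along the following lines. This is a minimal sketch, not the authors' own tooling: the column layout is not documented on this page, so the two-column sample rows, the GO IDs in them, and the helper names `read_tab_delimited` and `parse_numeric_column` are hypothetical and would need to be adapted after inspecting the real files.

```python
import csv
import io
import math


def read_tab_delimited(handle):
    """Yield each non-empty row of a tab-delimited file as a list of strings."""
    for row in csv.reader(handle, delimiter="\t"):
        if row:
            yield row


def parse_numeric_column(rows, column):
    """Return copies of the rows with one column converted to float.

    Only the chosen column is converted, so identifier columns
    (NCBI IDs, gene symbols, GO IDs, PubMed IDs) stay as text.
    float("NaN") yields a floating-point NaN.
    """
    parsed = []
    for row in rows:
        new_row = list(row)
        new_row[column] = float(new_row[column])
        parsed.append(new_row)
    return parsed


# Hypothetical sample; the real column layout is an assumption here.
sample = "GO:0000001\t0.25\nGO:0000002\tNaN\n"

rows = parse_numeric_column(read_tab_delimited(io.StringIO(sample)), column=1)

# Drop rows whose value is NaN (reported where division by zero occurred).
usable = [row for row in rows if not math.isnan(row[1])]
```

To read one of the downloaded files instead of the in-memory sample, open it with `open(path, newline="")` and pass the file object to `read_tab_delimited`.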
Item Type: | Dataset |
---|---|
Subjects: | bioinformatics > genomics and proteomics > annotation; bioinformatics; bioinformatics > genomics and proteomics > annotation > gene expression profiling annotation; bioinformatics > genomics and proteomics; bioinformatics > genomics and proteomics > Mapping and Rendering > ontology |
CSHL Authors: | |
Communities: | CSHL labs > Gillis Lab |
Depositing User: | Matt Covey |
Date: | 2013 |
Date Deposited: | 29 Apr 2013 14:06 |
Last Modified: | 29 Apr 2013 14:06 |
Related URLs: | |
URI: | https://repository.cshl.edu/id/eprint/28272 |