Imitating manual curation of text-mined facts in biomedicine

Rodriguez-Esteban, R., Iossifov, I., Rzhetsky, A. (September 2006) Imitating manual curation of text-mined facts in biomedicine. PLoS Comput Biol, 2 (9). e118. ISSN 1553-7358 (Electronic); 1553-734X (Linking)

PDF (Paper)
Iossifov PLoS Comp Biol 2006.pdf - Published Version

URL: http://www.ncbi.nlm.nih.gov/pubmed/16965176
DOI: 10.1371/journal.pcbi.0020118

Abstract

Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts--to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
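
The ROC score quoted in the abstract can be read as the probability that a classifier scores a randomly chosen correct fact above a randomly chosen incorrect one. The following is a minimal Python sketch of that calculation, not the authors' implementation. It assumes binary curator labels (1 = correctly extracted, 0 = not) and one real-valued classifier score per fact; the function name `roc_auc` and the sample data are hypothetical and do not come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Returns the fraction of (correct, incorrect) fact pairs in which the
    correct fact received the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect fact")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: six text-mined facts, labeled by a human curator
# and scored by an automated classifier (illustrative values only).
curator_labels = [1, 1, 0, 1, 0, 0]
classifier_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(curator_labels, classifier_scores)
```

The pairwise form is quadratic in the number of facts, which is acceptable for a small illustration; a rank-based formulation would be a better fit for a set on the order of 100,000 evaluations.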

Item Type: Paper
Uncontrolled Keywords: Abstracting and Indexing as Topic/methods; Algorithms; Artificial Intelligence; Automation; Biomedical Research; Cocaine; Computational Biology; Computer Simulation; Sensitivity and Specificity
Subjects: bioinformatics
bioinformatics > computational biology > algorithms
bioinformatics > computational biology
bioinformatics > computational biology > text mining
Communities: CSHL labs > Iossifov lab
Depositing User: Matt Covey
Date: 8 September 2006
Date Deposited: 01 Apr 2015 19:40
Last Modified: 01 Apr 2015 19:40
PMCID: PMC1560402
URI: https://repository.cshl.edu/id/eprint/31302
