Rodriguez-Esteban, R., Iossifov, I., Rzhetsky, A. (September 2006) Imitating manual curation of text-mined facts in biomedicine. PLoS Comput Biol, 2 (9). e118. ISSN 1553-7358 (Electronic) 1553-734X (Linking)
Abstract
Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts--to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
Item Type: | Paper |
---|---|
Uncontrolled Keywords: | Abstracting and Indexing as Topic/methods; Algorithms; Artificial Intelligence; Automation; Biomedical Research; Cocaine; Computational Biology; Computer Simulation; Sensitivity and Specificity |
Subjects: | bioinformatics; bioinformatics > computational biology > algorithms; bioinformatics > computational biology; bioinformatics > computational biology > text mining |
Communities: | CSHL labs > Iossifov lab |
Depositing User: | Matt Covey |
Date: | 8 September 2006 |
Date Deposited: | 01 Apr 2015 19:40 |
Last Modified: | 01 Apr 2015 19:40 |
PMCID: | PMC1560402 |
URI: | https://repository.cshl.edu/id/eprint/31302 |