Journal
MOLECULAR INFORMATICS
Volume 30, Issue 6-7, Pages 506-519
Publisher
WILEY-VCH VERLAG GMBH
DOI: 10.1002/minf.201100005
Keywords
Text mining; Named entity recognition; Information extraction; Chemical compounds; Drugs
Funding
- Innovative Medicines Initiative Joint Undertaking [115002]
- Eurocancercoms
- ISCIII combiomed network
Providing prior knowledge about biological properties of chemicals, such as kinetic values, protein targets, or toxic effects, can facilitate many aspects of drug development. Chemical information is rapidly accumulating in many kinds of free-text documents, such as patents, industry reports, and scientific articles, which has motivated the development of specifically tailored text mining applications. Despite the potential gains, chemical text mining still faces significant challenges. One of the most salient is the recognition of chemical entities mentioned in text. To help practitioners contribute to this area, a good portion of this review is devoted to this issue and presents the basic concepts and principles underlying the main strategies. The technical details are introduced and accompanied by relevant bibliographic references. Other tasks discussed include retrieving relevant articles, identifying relationships between chemicals and other entities, and determining the structures of chemicals mentioned in text. This review also introduces a number of published applications that can be used to build pipelines for topics such as drug side effects, toxicity, and protein-disease-compound network analysis. We conclude the review with an outlook on how we expect the field to evolve, discussing its possibilities and its current limitations.