Journal
BIOINFORMATICS
Volume 32, Issue 3, Pages 432-440
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btv585
Funding
- Commonwealth Scholarship Commission
- Cambridge Trust
- Vinnova
- Medical Research Council (MRC) grant [MR/M013049/1]
Motivation: The hallmarks of cancer have become highly influential in cancer research. They reduce the complexity of cancer into 10 principles (e.g. resisting cell death and sustaining proliferative signaling) that explain the biological capabilities acquired during the development of human tumors. Since new research depends crucially on existing knowledge, technology for semantic classification of scientific literature according to the hallmarks of cancer could greatly support literature review, knowledge discovery and applications in cancer research.
Results: We present the first step toward the development of such technology. We introduce a corpus of 1499 PubMed abstracts annotated according to the scientific evidence they provide for the 10 currently known hallmarks of cancer. We use this corpus to train a system that classifies PubMed literature according to the hallmarks. The system uses supervised machine learning and rich features largely based on biomedical text mining. We report good performance in both intrinsic and extrinsic evaluations, demonstrating both the accuracy of the methodology and its potential in supporting practical cancer research. We discuss how this approach could be developed and applied further in the future.