Article

Protein interaction sentence detection using multiple semantic kernels

Journal

JOURNAL OF BIOMEDICAL SEMANTICS
Volume 2, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/2041-1480-2-1

Keywords

-

Funding

  1. Scottish Enterprise PhD studentship
  2. NSF Expeditions in Computing grant on Computational Sustainability [0832782]
  3. EPSRC Advanced Research Fellowship [EP/E052029/1]
  4. EPSRC project [CLIMB EP/F009429/1]
  5. Engineering and Physical Sciences Research Council [EP/E052029/1] Funding Source: researchfish
  6. EPSRC [EP/E052029/1] Funding Source: UKRI
  7. Direct For Computer & Info Scie & Enginr
  8. Division Of Computer and Network Systems [0832782] Funding Source: National Science Foundation

Background: Detection of sentences that describe protein-protein interactions (PPIs) in biomedical publications is a challenging and unresolved pattern recognition problem. Many state-of-the-art approaches for this task employ kernel classification methods, in particular support vector machines (SVMs). In this work we propose a novel data integration approach that utilises semantic kernels and a kernel classification method that is a probabilistic analogue to SVMs. Semantic kernels are created from statistical information gathered from large amounts of unlabelled text using lexical semantic models. Several semantic kernels are then fused into an overall composite classification space. In this initial study, we use simple features in order to examine whether the use of combinations of kernels constructed using word-based semantic models can improve PPI sentence detection.

Results: We show that combinations of semantic kernels lead to statistically significant improvements in recognition rates and receiver operating characteristic (ROC) scores over the plain Gaussian kernel, when applied to a well-known labelled collection of abstracts. The proposed kernel composition method also allows us to automatically infer the most discriminative kernels.

Conclusions: The results from this paper indicate that using semantic information from unlabelled text, and combinations of such information, can be valuable for classification of short texts such as PPI sentences. This study, however, is only a first step in evaluation of semantic kernels and probabilistic multiple kernel learning in the context of PPI detection. The method described herein is modular, and can be applied with a variety of feature types, kernels, and semantic models, in order to facilitate full extraction of interacting proteins.
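The idea of fusing several kernels into one composite classification space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the kernel definitions or the probabilistic classifier, so the toy bag-of-words matrix, the `word_vectors` embedding, the helper names, and the fixed weights below are all assumptions made for illustration. The paper infers the kernel weights automatically, whereas here they are simply supplied by hand.

```python
import numpy as np

def gaussian_kernel(X, gamma=0.5):
    """Plain Gaussian (RBF) kernel between all pairs of rows in X."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def semantic_kernel(X, word_vectors):
    """Hypothetical semantic kernel: a linear kernel smoothed by a
    word-word similarity matrix S = W W^T, which is positive semi-definite."""
    S = word_vectors @ word_vectors.T
    return X @ S @ X.T

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices (weights rescaled to sum to 1)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data (assumed): 6 sentences over a 5-word vocabulary,
# and a 3-dimensional embedding per vocabulary word.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 5)).astype(float)
word_vectors = rng.normal(size=(5, 3))

K_gauss = gaussian_kernel(X)
K_sem = semantic_kernel(X, word_vectors)
K = combine_kernels([K_gauss, K_sem], [0.3, 0.7])
```

Because a non-negative weighted sum of positive semi-definite matrices is itself positive semi-definite, `K` is a valid kernel matrix and could be passed to any kernel classifier that accepts a precomputed Gram matrix.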