Journal
BMC BIOINFORMATICS
Volume 8, Issue -, Pages -
Publisher
BMC
DOI: 10.1186/1471-2105-8-423
Funding
- Intramural NIH HHS Funding Source: Medline
Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is relatedness, the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. Results: The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Conclusion: Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search.
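The abstract says that whether a document is about a topic is computed from term frequencies modeled as Poisson distributions. One generic way to picture this is a two-Poisson mixture: a term occurs at one rate in documents that are about its topic and at a lower rate elsewhere, and Bayes' rule turns an observed count into a probability. The sketch below is only an illustration under that assumption and is not the pmra formula itself; the function names, both rates, and the prior are hypothetical values chosen for the example, not parameters from the paper.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of observing exactly k occurrences under Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def prob_about_topic(k: int, rate_about: float, rate_other: float,
                     prior_about: float) -> float:
    """Posterior probability that a document is about a topic, given that
    the topic's term occurs k times in it.

    Assumption (illustrative only): counts follow Poisson(rate_about) in
    documents about the topic and Poisson(rate_other) in all others, and
    prior_about is the fraction of documents about the topic.
    """
    about = prior_about * poisson_pmf(k, rate_about)
    other = (1.0 - prior_about) * poisson_pmf(k, rate_other)
    return about / (about + other)


if __name__ == "__main__":
    # Hypothetical parameters: the term appears about 4 times in on-topic
    # documents, 0.5 times elsewhere, and 10% of documents are on-topic.
    for count in range(6):
        p = prob_about_topic(count, rate_about=4.0, rate_other=0.5,
                             prior_about=0.1)
        print(count, round(p, 4))
```

Under these assumed parameters the probability rises with the term count, which matches the intuition that repeated use of a term signals that the document is about the corresponding topic.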