Journal
BMC BIOINFORMATICS
Volume 8, Issue -, Pages -
Publisher
BMC
DOI: 10.1186/1471-2105-8-423
Funding
- Intramural NIH HHS Funding Source: Medline
Background: We present a probabilistic topic-based model for content similarity, called pmra, that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, which are modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is relatedness, the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process relies on the existence of MeSH® in MEDLINE®.

Results: The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision.

Conclusion: Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search.
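The idea of deciding whether a document is about a topic from Poisson-modeled term frequencies can be illustrated with a minimal sketch. It assumes a generic two-Poisson mixture in which a term occurs at a higher rate in documents that are about the topic than in documents that are not, and applies Bayes' rule to an observed term count. The function names, rates, and prior below are hypothetical and chosen only for illustration; the abstract does not give the pmra formulas or parameter values, so this should not be read as the paper's actual model.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of observing k occurrences under a Poisson with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def prob_about_topic(k: int, about_rate: float, not_about_rate: float, prior: float) -> float:
    """Posterior probability that a document is about a topic given term count k.

    Assumption (not from the source): term counts follow a Poisson with
    about_rate when the document is about the topic and a Poisson with
    not_about_rate otherwise; prior is the probability of being about
    the topic before seeing any counts.
    """
    about = prior * poisson_pmf(k, about_rate)
    not_about = (1.0 - prior) * poisson_pmf(k, not_about_rate)
    return about / (about + not_about)


if __name__ == "__main__":
    # Hypothetical parameters: the term appears about 3 times in on-topic
    # documents and about 0.5 times otherwise.
    for count in range(6):
        print(count, round(prob_about_topic(count, 3.0, 0.5, 0.5), 4))
```

With these illustrative rates, the posterior rises steadily as the term count grows, which is the behaviour one would expect from any model of this kind.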