4.5 Article

Emati: a recommender system for biomedical literature based on supervised learning

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/database/baac104

Funding

  1. Federal Ministry of Education and Research project Center for Scalable Data Analytics and Artificial Intelligence

This study develops Emati, a web-based article recommender service that uses a content-based approach and supervised machine learning models. Two approaches are implemented: a naive Bayes model on TF-IDF features, and fine-tuning of the BERT language model. Emati provides users with regularly updated article recommendations and also offers personalized search functionality.
The scientific literature continues to grow at an ever-increasing rate. With thousands of new articles published every week, keeping up with newly published literature on a regular basis is challenging. A recommender system that improves the user experience in the online environment can be a solution to this problem. In the present study, we aimed to develop a web-based article recommender service called Emati. Since the data are text-based by nature and we wanted our system to be independent of the number of users, a content-based approach has been adopted. A supervised machine learning model has been proposed to generate article recommendations. Two different supervised learning approaches have been implemented: the naive Bayes model with a Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer, and the state-of-the-art language model bidirectional encoder representations from transformers (BERT). In the first approach, a list of documents is converted into TF-IDF-weighted features and fed into a classifier that distinguishes relevant articles from irrelevant ones. The multinomial naive Bayes algorithm is used as the classifier because, along with the class label, it also returns the probability that the input belongs to that class. The second approach is based on fine-tuning the pretrained BERT model for the text classification task. Emati provides a weekly updated list of article recommendations and presents it to the user, sorted by probability score. New article recommendations are also sent to users' email addresses on a weekly basis. Additionally, Emati has a personalized search feature that searches the content of online services (such as PubMed and arXiv) and sorts the results using the user's classifier.
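
The first approach described in the abstract (TF-IDF features, a multinomial naive Bayes classifier, and ranking by class probability) can be illustrated with a minimal, standard-library-only sketch. This is not Emati's actual code: the function names, the toy article titles, the smoothed IDF formula, the L2 normalization and the smoothing constant `alpha` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only (an illustrative choice).
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    # L2-normalized TF-IDF vector; terms outside the vocabulary are dropped.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def train_nb(docs, labels, alpha=1.0):
    # Multinomial naive Bayes on fractional TF-IDF "counts".
    # Labels: 1 = relevant, 0 = irrelevant; both classes must be present.
    idf = fit_idf(docs)
    sums = {0: defaultdict(float), 1: defaultdict(float)}
    for doc, y in zip(docs, labels):
        for t, v in tfidf(doc, idf).items():
            sums[y][t] += v
    class_counts = Counter(labels)
    model = {"idf": idf, "log_prior": {}, "log_lik": {}}
    for y in (0, 1):
        total = sum(sums[y].values()) + alpha * len(idf)
        model["log_prior"][y] = math.log(class_counts[y] / len(labels))
        model["log_lik"][y] = {
            t: math.log((sums[y].get(t, 0.0) + alpha) / total) for t in idf
        }
    return model


def relevance_probability(model, doc):
    # Posterior probability of the "relevant" class for one document.
    vec = tfidf(doc, model["idf"])
    scores = {
        y: model["log_prior"][y]
        + sum(v * model["log_lik"][y][t] for t, v in vec.items())
        for y in (0, 1)
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {y: math.exp(s - top) for y, s in scores.items()}
    return exp[1] / (exp[0] + exp[1])


def recommend(model, candidates):
    # Sort candidates by descending probability of being relevant.
    scored = [(relevance_probability(model, d), d) for d in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy training data (hypothetical titles, not taken from the paper).
train_docs = [
    "protein structure prediction with deep learning",
    "text mining of biomedical literature",
    "gene expression analysis in cancer cells",
    "galaxy formation in the early universe",
    "stock market volatility forecasting",
    "quantum computing hardware benchmarks",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = train_nb(train_docs, train_labels)
ranked = recommend(model, [
    "volatility of galaxy cluster formation",
    "deep learning for biomedical text mining",
])
for prob, title in ranked:
    print(f"{prob:.2f}  {title}")
```

The same `recommend` function could, under these assumptions, also re-rank the results of an external search, which is the idea behind the personalized search feature mentioned in the abstract.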
