Article

An empirical study on the effect of data sparsity and data overlap on cross domain collaborative filtering performance

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 89, Pages 254-265

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2017.07.041

Keywords

Cross-domain recommender; Collaborative filtering; E-commerce application; Data sparsity

Funding

  1. New Faculty Research Grant of Keimyung University [20150149, 20160020]

In the present day, the oversaturation of data has complicated the process of finding information in a data source. Recommender systems aim to alleviate this problem in various domains by actively suggesting selected information to potential users based on their personal preferences. Among these systems, collaborative filtering based recommenders (CF recommenders), which make use of users' implicit and explicit ratings for items, are widely regarded as the most successful type of recommender system. However, CF recommenders are sensitive to data sparsity, where users rate very few items or items receive very few ratings, so that there is not enough data to make a recommendation. The majority of studies have attempted to solve this problem by developing new algorithms within a single domain. Recently, cross-domain recommenders that use datasets from multiple domains have attracted increasing attention in the research community. Cross-domain recommenders assume that users who express their preferences in one domain (called the target domain) will also express their preferences in another domain (called the source domain), and that these additional preferences will improve the precision and recall of recommendations to the user. The purpose of this study is to investigate the effects of data sparsity and data overlap on the performance of cross-domain CF recommenders, using various aggregation functions. In this study, several different cross-domain recommenders were created by collecting three datasets from three separate domains of a large Korean fashion company and combining them with different algorithms and different aggregation approaches. The cross-domain recommenders that used high-performance, high-overlap domains showed significant improvement in the precision and recall of recommendations when the recommendation scores of the individual domains were combined using the summation aggregation function. However, the cross-domain recommenders that used low-performance, low-overlap domains showed little or no performance improvement in any area. This result implies that the use of cross-domain recommenders does not guarantee performance improvement; rather, the relevant factors must be considered carefully to achieve performance improvement when using cross-domain recommenders. © 2017 Elsevier Ltd. All rights reserved.
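
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy scores, and the exact definitions of sparsity (share of empty user-item cells) and overlap (share of target-domain users also present in the source domain) are assumptions made for illustration, as is the premise that items share identifiers across domains.

```python
def sparsity(n_ratings, n_users, n_items):
    """Share of user-item cells with no rating (assumed definition)."""
    return 1.0 - n_ratings / (n_users * n_items)


def user_overlap(source_users, target_users):
    """Share of target-domain users who also appear in the source domain
    (assumed definition; the paper may measure overlap differently)."""
    target_users = set(target_users)
    return len(target_users & set(source_users)) / len(target_users)


def aggregate_sum(domain_scores):
    """Summation aggregation: add each item's score across all domains.

    domain_scores is a list of dicts mapping item -> score, one per domain.
    """
    total = {}
    for scores in domain_scores:
        for item, score in scores.items():
            total[item] = total.get(item, 0.0) + score
    return total


def top_k(scores, k):
    """Return the k highest-scoring items; ties are broken by item name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up per-domain scores for a single user (hypothetical data).
target = {"coat": 0.5, "scarf": 0.25, "boots": 0.125}
source = {"scarf": 0.5, "boots": 0.5, "hat": 0.25}

combined = aggregate_sum([target, source])
recs = top_k(combined, 2)
print(recs)                                        # ['scarf', 'boots']
print(precision_recall(recs, {"scarf", "boots"}))  # (1.0, 1.0)
```

In this toy case the target domain alone would recommend coat and scarf (precision and recall of 0.5 against the assumed relevant set), while the summed scores recover both relevant items. The numbers are invented to show the mechanics and say nothing about the study's results.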
