Proceedings Paper

Counterfactual Data-Augmented Sequential Recommendation

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3404835.3462855

Keywords

Recommendation System; Counterfactual Data Augmentation

Funding

  1. Beijing Outstanding Young Scientist Program [BJJWZYJH012019100020098]
  2. National Natural Science Foundation of China [61832017]

This paper proposes a novel counterfactual data augmentation framework to address the poor performance of sequential recommendation models caused by the sparsity of real-world data. The framework consists of a sampler model and an anchor model, which work together to improve the quality of the training data by generating new user behavior sequences. Experimental results demonstrate the effectiveness and generality of the proposed framework.

Sequential recommendation aims at predicting users' preferences based on their historical behaviors. However, this recommendation strategy may not perform well in practice due to the sparsity of real-world data. In this paper, we propose a novel counterfactual data augmentation framework to mitigate the impact of imperfect training data and empower sequential recommendation models. Our framework is composed of a sampler model and an anchor model. The sampler model generates new user behavior sequences based on the observed ones, while the anchor model, which is trained on both observed and generated sequences, provides the final recommendation list. We design the sampler model to answer the key counterfactual question: what would a user like to buy if her previously purchased items had been different? Beyond heuristic intervention methods, we leverage two learning-based methods to implement the sampler model, and thus improve the quality of the generated sequences used to train the anchor model. Additionally, we analyze in theory the influence of the generated sequences on the anchor model and achieve a trade-off between the information and the noise introduced by the generated sequences. Experiments on nine real-world datasets demonstrate our framework's effectiveness and generality.
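
The sampler/anchor split described above can be illustrated with a minimal sketch of a heuristic intervention: replace one item in an observed sequence, ask a sampler for the counterfactual next item, and keep the result only if the sampler is confident enough. This is not the authors' implementation. The function names, the random single-item replacement, the `min_confidence` threshold (one possible way to trade information against noise), and the `toy_predict` stand-in for a trained sampler model are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Tuple

History = List[int]
Sample = Tuple[History, int]                        # (behavior sequence, next item)
Predictor = Callable[[History], Tuple[int, float]]  # -> (next item, confidence)


def counterfactual_sample(
    history: History,
    item_pool: List[int],
    predict: Predictor,
    min_confidence: float,
    rng: random.Random,
) -> Optional[Sample]:
    """Replace one item of `history` at random and query the sampler."""
    if not history:
        return None
    position = rng.randrange(len(history))
    candidates = [item for item in item_pool if item != history[position]]
    if not candidates:
        return None
    edited = list(history)  # copy, so the observed sequence is not mutated
    edited[position] = rng.choice(candidates)
    next_item, confidence = predict(edited)
    # Assumed noise control: drop low-confidence counterfactuals.
    if confidence < min_confidence:
        return None
    return edited, next_item


def augment(
    observed: List[Sample],
    item_pool: List[int],
    predict: Predictor,
    min_confidence: float = 0.5,
    seed: int = 0,
) -> List[Sample]:
    """Return observed samples plus the accepted counterfactual samples."""
    rng = random.Random(seed)
    generated = []
    for history, _ in observed:
        sample = counterfactual_sample(history, item_pool, predict, min_confidence, rng)
        if sample is not None:
            generated.append(sample)
    return list(observed) + generated


def toy_predict(history: History) -> Tuple[int, float]:
    """Hypothetical stand-in for a trained sampler model."""
    return (history[-1] + 1) % 10, 0.8


observed = [([1, 2, 3], 4), ([5, 6], 7)]
training_set = augment(observed, list(range(10)), toy_predict)
```

In this sketch, `training_set` would be what an anchor model is trained on. Raising `min_confidence` admits fewer generated sequences (less noise, less added information), while lowering it admits more.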

