3.8 Proceedings Paper

Diversifying Neural Dialogue Generation via Negative Distillation

Publisher

Association for Computational Linguistics (ACL)

Keywords

-

Funding

  1. Beijing Natural Science Foundation [4222037, L181010]
  2. National Natural Science Foundation of China [61972035]


The paper introduces a novel negative training paradigm, called negative distillation, to address the generic response problem in generative dialogue models. It introduces a negative teacher model that produces query-wise generic responses and requires the student model to maximize its distance from the teacher's multi-level negative knowledge, which yields significant improvements over previous negative training approaches.
Generative dialogue models suffer badly from the generic response problem, which limits their application to a few toy scenarios. Recently, an interesting approach, namely negative training, was proposed to alleviate this problem by reminding the model during training not to generate high-frequency responses. However, its performance is hindered by two issues: it ignores responses that are low-frequency but still generic, and it introduces responses that are low-frequency but meaningless. In this paper, we propose a novel negative training paradigm, called negative distillation, that keeps the model away from undesirable generic responses while avoiding both problems. First, we introduce a negative teacher model that can produce query-wise generic responses; the student model is then required to maximize its distance from this multi-level negative knowledge. Empirical results show that our method significantly outperforms previous negative training methods.
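
To make the training objective concrete, the following is a minimal PyTorch sketch of a negative-distillation-style loss, written under stated assumptions rather than as the paper's actual implementation: a frozen negative teacher models query-wise generic responses, and the student combines ordinary maximum-likelihood training on the reference response with a term that pushes its token distributions away from the teacher's (only the word-level part of the multi-level negative knowledge the abstract mentions). All names, the alpha weight, and the exact combination of terms are illustrative.

# Minimal sketch of a negative-distillation-style training loss (PyTorch).
# Assumptions, not the paper's implementation: `teacher_logits` come from a
# frozen "negative teacher" that models query-wise generic responses, and
# `student_logits` come from the dialogue model being trained. Function and
# argument names are hypothetical.
import torch
import torch.nn.functional as F


def negative_distillation_loss(student_logits, teacher_logits, target_ids,
                               pad_id=0, alpha=0.5, temperature=1.0):
    """student_logits, teacher_logits: (batch, seq_len, vocab)
    target_ids: (batch, seq_len) reference response tokens."""
    vocab = student_logits.size(-1)

    # 1) Positive signal: ordinary negative log-likelihood of the reference.
    nll = F.cross_entropy(student_logits.reshape(-1, vocab),
                          target_ids.reshape(-1),
                          ignore_index=pad_id)

    # 2) Negative signal: per-token KL(student || negative teacher).
    #    Subtracting it (note the minus sign below) maximizes the divergence,
    #    i.e. pushes the student away from the teacher's generic preferences,
    #    the opposite direction of ordinary knowledge distillation.
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_logp = F.log_softmax(teacher_logits.detach() / temperature, dim=-1)
    kl = F.kl_div(teacher_logp, student_logp, log_target=True,
                  reduction="none").sum(-1)              # (batch, seq_len)

    mask = (target_ids != pad_id).float()
    neg_term = -(kl * mask).sum() / mask.sum().clamp(min=1.0)

    return nll + alpha * neg_term


# Toy usage with random tensors (batch=2, seq_len=5, vocab=100).
if __name__ == "__main__":
    student = torch.randn(2, 5, 100)
    teacher = torch.randn(2, 5, 100)
    target = torch.randint(1, 100, (2, 5))
    print(negative_distillation_loss(student, teacher, target).item())

In practice the alpha weight and temperature would need tuning, and a repulsive term like this is typically bounded or annealed so that moving away from generic responses does not destabilize training; the paper's multi-level negative knowledge also goes beyond the single word-level term sketched here.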

Authors

-
