Article

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

Journal

ACM Computing Surveys
Volume 55, Issue 10

Publisher

Association for Computing Machinery
DOI: 10.1145/3561970

Keywords

Contrastive learning


Summary

Modern NLP methods use self-supervised pretraining objectives, such as masked language modeling, to improve downstream task performance. Contrastive self-supervised objectives have been successful in image representation pretraining, but in NLP a single token augmentation can invert a sentence's meaning, motivating input-output (input-label) contrastive approaches. This primer summarizes recent self-supervised and supervised contrastive NLP pretraining methods and their applications in language modeling, zero- to few-shot learning, pretraining data-efficiency, and specific NLP tasks.
Abstract

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP, however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which led to input-output contrastive approaches that avoid the issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero- to few-shot learning, pretraining data-efficiency, and specific NLP tasks. We overview key contrastive learning concepts with lessons learned from prior research and structure works by applications. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.
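
To make the contrast described in the abstract concrete, the following is a minimal, illustrative sketch (not taken from the surveyed paper) of a generic InfoNCE-style contrastive objective in PyTorch. Matching rows of the two embedding batches are treated as a positive pair (e.g. two augmented views of one input, or an input and its label representation), while all other in-batch pairings act as negatives. The function name, temperature value, and toy data below are assumptions for illustration only.

import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Generic InfoNCE loss: matching rows of z1 and z2 are positives,
    every other in-batch pairing is a negative."""
    z1 = F.normalize(z1, dim=-1)                 # unit-normalize embeddings
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature           # (batch, batch) cosine-similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: 8 pairs of 128-dimensional embeddings.
anchor = torch.randn(8, 128)
paired_view = anchor + 0.05 * torch.randn(8, 128)  # stand-in for an augmented or label-paired view
print(info_nce_loss(anchor, paired_view).item())

In the self-supervised (input-input) setting the two views come from data augmentation, whereas the input-output variants discussed in the primer instead pair an input with a representation of its label, sidestepping meaning-inverting token augmentations.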

Authors

Nils Rethmeier, Isabelle Augenstein
