Article

Robust scientific text classification using prompt tuning based on data augmentation with L2 regularization

Journal

INFORMATION PROCESSING & MANAGEMENT
Volume 61, Issue 1

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ipm.2023.103531

Keywords

Scientific text classification; Pre-training model; Prompt tuning; Data augmentation; Pairwise training; L2 regularization

This research proposes a method to enhance prompt tuning using data augmentation with L2 regularization and demonstrates significant improvements in the accuracy and robustness of language models on two scientific text datasets.
Recently, prompt tuning, which incorporates prompts into the input of a pre-trained language model (such as BERT or GPT), has shown promise in improving model performance when annotated data is limited. However, templates with equivalent semantics do not necessarily produce equivalent prompt effects, and prompt tuning often exhibits unstable performance, a problem that is more severe in the scientific domain. To address this challenge, we propose to enhance prompt tuning using data augmentation with L2 regularization. Specifically, pairwise training is performed on pairs of original and augmented (transformed) data. Our experiments on two scientific text datasets (ACL-ARC and SciCite) demonstrate that the proposed method significantly improves both accuracy and robustness. Using 1000 of the 1688 samples in the ACL-ARC training set, our method achieved an F1 score 3.33% higher than the same model trained on all 1688 samples. On the SciCite dataset, our method surpassed the same model even with labeled data reduced by over 93%. Our method also proves highly robust, reaching F1 scores 1% to 8% higher than models without it after the Probability Weighted Word Saliency (PWWS) attack.
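
The core idea described in the abstract is to train each example jointly with an augmented copy and to add an L2 penalty that pulls the two predictions together. The following is a minimal sketch of such a pairwise objective, not the authors' implementation: it assumes a BERT-style masked-LM backbone with a single-token verbalizer, and the template, label words, augmentation source, and helper names are all illustrative.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-uncased"                    # assumed backbone
TEMPLATE = "{text} This citation is [MASK]."   # illustrative prompt template
VERBALIZER = {"background": 0, "method": 1, "result": 2}  # label word -> class id

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
label_token_ids = torch.tensor(
    [tokenizer.convert_tokens_to_ids(w) for w in VERBALIZER]
)

def mask_logits(texts):
    """Return MLM logits at the [MASK] position, restricted to the label words."""
    enc = tokenizer(
        [TEMPLATE.format(text=t) for t in texts],
        return_tensors="pt", padding=True, truncation=True,
    )
    out = model(**enc).logits                          # (batch, seq, vocab)
    mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()
    logits = out[mask_pos[:, 0], mask_pos[:, 1]]       # (batch, vocab)
    return logits[:, label_token_ids]                  # (batch, num_classes)

def pairwise_loss(texts, aug_texts, labels, lam=0.1):
    """Cross-entropy on both views plus an L2 penalty pulling them together.

    `aug_texts` are augmented copies of `texts` (e.g. synonym replacement or
    back-translation); the pairing of original and transformed data mirrors
    the pairwise training described in the abstract.
    """
    logits_orig = mask_logits(texts)
    logits_aug = mask_logits(aug_texts)
    ce = F.cross_entropy(logits_orig, labels) + F.cross_entropy(logits_aug, labels)
    l2 = ((logits_orig - logits_aug) ** 2).sum(dim=-1).mean()
    return ce + lam * l2

Here `lam` trades off the task loss against the consistency penalty; the paper's actual regularization target (logits, hidden representations, or parameters) and augmentation scheme may differ.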

