4.6 Article

Comprehensive study of pre-trained language models: detecting humor in news headlines

Journal

SOFT COMPUTING
Volume 27, Issue 5, Pages 2575-2599

Publisher

SPRINGER
DOI: 10.1007/s00500-022-07573-z

Keywords

Humor; Pre-trained models; BERT; Flair; BERT knowledge; BERT vocabulary

Abstract

The ability to automatically understand and analyze human language has attracted researchers and practitioners in the Natural Language Processing (NLP) field. Detecting humor is an NLP task needed in many areas, including marketing, politics, and news. However, the task is challenging because humor depends on context, emotion, culture, and rhythm. To address this problem, we propose a robust model called BFHumor, a BERT-Flair-based humor detection model that detects humor in news headlines. It is an ensemble of state-of-the-art pre-trained models combined with various NLP techniques. We evaluated the proposed model on public humor datasets from the SemEval-2020 workshop, where it achieved outstanding performance: a Root Mean Squared Error (RMSE) of 0.51966 and an accuracy of 0.62291. In addition, we extensively investigated the underlying reasons for the BFHumor model's strong performance on humor detection. To that end, we conducted two experiments on the BERT model: one at the vocabulary level and one at the linguistic-capturing level. Our investigation shows that BERT captures surface knowledge in its lower layers, syntactic knowledge in its middle layers, and semantic knowledge in its higher layers.
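The paper's implementation is not reproduced here, but the ensemble idea is straightforward to sketch: represent each headline with document embeddings from both BERT and Flair and feed the concatenation to a downstream regressor. The sketch below uses the `flair` library; the model names (`bert-base-uncased`, `news-forward`/`news-backward`) and the pooling choices are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' code) of a BFHumor-style feature ensemble:
# concatenate a BERT document embedding with pooled Flair character-LM
# embeddings for each headline. Model names are illustrative assumptions.
import torch
from flair.data import Sentence
from flair.embeddings import (
    TransformerDocumentEmbeddings,
    DocumentPoolEmbeddings,
    FlairEmbeddings,
)

# BERT document embedding and mean-pooled forward/backward Flair embeddings
bert_doc = TransformerDocumentEmbeddings("bert-base-uncased")
flair_doc = DocumentPoolEmbeddings(
    [FlairEmbeddings("news-forward"), FlairEmbeddings("news-backward")]
)

def headline_features(text: str) -> torch.Tensor:
    """Concatenate BERT and Flair document embeddings for one headline."""
    s_bert, s_flair = Sentence(text), Sentence(text)
    bert_doc.embed(s_bert)
    flair_doc.embed(s_flair)
    # The combined vector would feed a downstream humor regressor/classifier.
    return torch.cat([s_bert.embedding, s_flair.embedding])

features = headline_features("Scientists teach parrot to file taxes")
print(features.shape)
```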

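The layer-wise findings (surface, syntactic, and semantic knowledge at different depths) come from probing BERT's intermediate representations. A minimal sketch of that kind of setup, assuming `bert-base-uncased` via the Hugging Face `transformers` library rather than the paper's exact probes: extract each layer's [CLS] vector per headline, then train a simple classifier on each layer's vectors to see what information that layer encodes.

```python
# Minimal layer-probing sketch (illustrative, not the paper's code): collect
# the [CLS] representation from every BERT layer for one headline. Each
# layer's vectors can then train a separate lightweight probe classifier.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

inputs = tokenizer("Scientists teach parrot to file taxes", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states holds the embedding layer plus all 12 transformer layers.
for layer_idx, layer in enumerate(outputs.hidden_states):
    cls_vector = layer[0, 0]  # [CLS] token representation at this layer
    print(layer_idx, cls_vector.shape)
```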