Comprehensive study of pre-trained language models: detecting humor in news headlines

  • Jordan University of Science and Technology

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

The ability to automatically understand and analyze human language has attracted researchers and practitioners in the Natural Language Processing (NLP) field. Detecting humor is an NLP task needed in many areas, including marketing, politics, and news. However, the task is challenging because humor depends on context, emotion, culture, and rhythm. To address this problem, we propose a robust model called BFHumor, a BERT-Flair-based humor detection model that detects humor in news headlines. It is an ensemble of different state-of-the-art pre-trained models that utilizes various NLP techniques. We used public humor datasets from the SemEval-2020 workshop to evaluate the proposed model. The model achieved outstanding performance, with a Root Mean Squared Error (RMSE) of 0.51966 and an accuracy of 0.62291. In addition, we extensively investigated the underlying reasons behind the high accuracy of the BFHumor model in humor detection tasks. To that end, we conducted two experiments on the BERT model: one at the vocabulary level and one at the linguistic capturing level. Our investigation shows that BERT captures surface knowledge in the lower layers, syntactic knowledge in the middle layers, and semantic knowledge in the higher layers.
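For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how RMSE and accuracy are computed, together with a simple averaging ensemble. The abstract does not say how BFHumor combines its member models, so the unweighted mean used here is an assumption for illustration only, and the function names `rmse`, `accuracy`, and `average_ensemble` are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences of scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def accuracy(labels_true, labels_pred):
    """Fraction of predicted labels that match the reference labels."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


def average_ensemble(model_predictions):
    """Combine per-model score lists by an unweighted mean per example.

    Assumption: plain averaging. The combination rule used by BFHumor
    is not described in the abstract.
    """
    n_models = len(model_predictions)
    if n_models == 0:
        raise ValueError("at least one model's predictions are required")
    return [sum(scores) / n_models for scores in zip(*model_predictions)]
```

As a usage example, `average_ensemble([[1.0, 2.0], [3.0, 4.0]])` averages the two models' scores for each headline, and the result can be passed to `rmse` alongside the gold funniness scores.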

Original language: English
Pages (from-to): 2575-2599
Number of pages: 25
Journal: Soft Computing
Volume: 27
Issue number: 5
State: Published - Mar 2023

Keywords

  • BERT
  • BERT knowledge
  • BERT vocabulary
  • Flair
  • Humor
  • Pre-trained models
