TY - GEN
T1 - Sarcasm Detection and Quantification in Arabic Tweets
AU - Talafha, Bashar
AU - Za'ter, Muhy Eddin
AU - Suleiman, Samer
AU - Al-Ayyoub, Mahmoud
AU - Al-Kabi, Mohammed N.
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Automatic sarcasm detection is the task of predicting whether a text is sarcastic. Given the prevalence of sarcasm in sentiment-bearing text and the challenges it poses, this is a critical step in most sentiment analysis tasks. With the growing popularity of social media platforms around the world, people increasingly use sarcasm in their day-to-day conversations, social media posts and tweets as a way to express their sentiment about particular topics or issues. As a result, researchers have begun to focus on detecting sarcasm in text in different languages, especially English. However, sarcasm detection is challenging because of the nature of sarcastic text, which is relative and differs significantly from one person to another depending on the topic, region, the user's mentality and other factors. Sarcasm detection in Arabic faces additional challenges due to the complexity of the language: Arabic is morphologically rich, has many dialects that vary significantly from one another, and is low-resourced compared to English. In recent years, only a few research efforts have tackled sarcasm detection in Arabic, including creating and collecting corpora, organizing workshops and establishing baseline models. This paper aims to create a new human-annotated Arabic corpus for sarcasm detection collected from tweets, and to implement a new approach for sarcasm detection and quantification in Arabic tweets. The annotation technique followed in this paper is unique in sarcasm detection, and the proposed approach treats the problem as regression instead of classification; i.e., the model attempts to predict the level of sarcasm instead of a binary label (sarcastic vs. non-sarcastic), in order to address the complex and user-dependent nature of sarcastic text. The human-annotated dataset will be made available to the public for any usage.
KW - Arabic BERT
KW - Arabic Sarcasm Detection
KW - Sarcasm
UR - https://www.scopus.com/pages/publications/85123933872
U2 - 10.1109/ICTAI52525.2021.00177
DO - 10.1109/ICTAI52525.2021.00177
M3 - Conference contribution
AN - SCOPUS:85123933872
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 1121
EP - 1125
BT - Proceedings - 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence, ICTAI 2021
PB - IEEE Computer Society
T2 - 33rd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2021
Y2 - 1 November 2021 through 3 November 2021
ER -