Abstract
Despite the countless advantages of the free, open, and ubiquitous nature of online social networks, they come with their own set of problems and challenges. For example, they are fertile ground for fake accounts and autonomous bots that spread fake news. Determining whether a piece of text was written by a bot or a human would be of great value in the fight against the spread of fake news and misinformation. In this paper, we address this problem using three families of Machine Learning (ML) techniques: conventional, Deep Learning (DL) based, and Transfer Learning (TL) based. Using the dataset of the well-known PAN 2019 Author Profiling Task, we show that relatively simple conventional ML methods can outperform DL-based and TL-based ones for both languages considered (English and Spanish). In fact, our simplest model performs close to the state-of-the-art (SOTA) systems for English and even outperforms the SOTA systems for Spanish.
| Original language | English |
|---|---|
| Journal | Proceedings of the Association for Information Science and Technology |
| Volume | 57 |
| Issue number | 1 |
| DOIs | |
| State | Published - 2020 |
| Externally published | Yes |
Keywords
- BERT
- Bot Identification
- LSTM
- LinearSVC
Title
Authorship analysis of English and Spanish tweets