
Spam email detection using deep learning techniques

  • Jordan University of Science and Technology

Research output: Contribution to journal › Conference article › peer-review

161 Scopus citations

Abstract

Unsolicited emails such as phishing and spam emails cost businesses and individuals millions of dollars annually. Several models and techniques for automatically detecting spam emails have been introduced and developed, yet none has achieved 100% predictive accuracy. Among the proposed approaches, machine learning and deep learning algorithms have been the most successful, and natural language processing (NLP) has further improved the models' accuracy. This work examines the effectiveness of word embedding in classifying spam emails. The pre-trained transformer model BERT (Bidirectional Encoder Representations from Transformers) is fine-tuned to distinguish spam emails from non-spam (ham) emails. BERT uses attention layers to take the context of the text into account. Results are compared to a baseline DNN (deep neural network) model that contains a BiLSTM (bidirectional Long Short-Term Memory) layer and two stacked Dense layers. In addition, results are compared to a set of classic classifiers: k-NN (k-nearest neighbors) and NB (Naive Bayes). Two open-source data sets are used, one to train the model and the other to test the persistence and robustness of the model against unseen data. The proposed approach attained the highest accuracy, 98.67%, with an F1 score of 98.66%.
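To make the classic-classifier comparison and the reported metrics concrete, the following is a minimal sketch of a Naive Bayes spam classifier scored with accuracy and F1. It is not the authors' implementation: the abstract gives no implementation details, so the whitespace tokenizer, the Laplace smoothing, the toy messages, and all names (`NaiveBayes`, `accuracy_and_f1`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not stated.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict_one(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.word_counts[c][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def accuracy_and_f1(y_true, y_pred, positive="spam"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy data invented for this sketch; not taken from the paper's data sets.
train_texts = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_texts = ["claim your free money", "project meeting on monday"]
test_labels = ["spam", "ham"]

model = NaiveBayes().fit(train_texts, train_labels)
predictions = model.predict(test_texts)
accuracy, f1 = accuracy_and_f1(test_labels, predictions)
print(predictions, accuracy, f1)
```

Training on one set and scoring on a separate one mirrors, in miniature, the train/test split across two data sets described in the abstract. The figures this toy produces say nothing about the reported 98.67% accuracy.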

Original language: English
Pages (from-to): 853-858
Number of pages: 6
Journal: Procedia Computer Science
Volume: 184
State: Published - 2021
Externally published: Yes
Event: 12th International Conference on Ambient Systems, Networks and Technologies, ANT 2021 / 4th International Conference on Emerging Data and Industry 4.0, EDI40 2021 / Affiliated Workshops - Warsaw, Poland
Duration: 23 Mar 2021 → 26 Mar 2021

Keywords

  • BERT transformer
  • Cybersecurity
  • Deep learning
  • Spam
  • Word embedding
