TY - GEN
T1 - AI-Driven Insider Threat Detection using NLP and Anomaly Detection Models for Identifying Malicious Activities in Organizations
AU - Al Said, Nidal
N1 - Publisher Copyright: © 2026 IEEE.
PY - 2026
Y1 - 2026
AB - Insider threats pose a serious risk to organizations, including financial loss, data breaches, and reputational damage. Conventional security measures often fail to detect them because they are subtle and context-dependent. This work constructs an AI-based framework for insider threat detection that applies NLP methods and anomaly detection models to the Enron email dataset, which consists of about 500,000 real organizational emails. Unlike traditional approaches, the framework aims to improve threat detection by combining deep learning-based anomaly detection with state-of-the-art NLP techniques. Sentiment analysis, named entity recognition, and topic modeling are used to gain insight into email content, while Isolation Forests, Autoencoders, One-Class SVM, LSTM, and GAN-based models are used to identify anomalous behavior. The results suggest that GAN-based anomaly detection performed best, with an F1 score of 0.86 and an AUC-ROC of 0.93, both significantly higher than those of the other models. The study also identified 5% of emails as actual insider threats, based on behavioral analysis that showed high employee activity outside working hours and unusual email rates. The combination of NLP and anomaly detection proved effective in detecting malicious activity within organizations. The article offers a viable and scalable way to improve organizational cybersecurity by automating insider threat detection in enterprise environments.
KW - Anomaly Detection
KW - Cybersecurity
KW - Deep Learning
KW - Enron Email Dataset
KW - Insider Threat Detection
KW - Natural Language Processing
UR - https://www.scopus.com/pages/publications/105036594678
U2 - 10.1109/ICIPCN67432.2026.11438281
DO - 10.1109/ICIPCN67432.2026.11438281
M3 - Conference contribution
AN - SCOPUS:105036594678
T3 - Proceedings of the 2026 6th International Conference on Image Processing and Capsule Networks, ICIPCN 2026
SP - 1550
EP - 1555
BT - Proceedings of the 2026 6th International Conference on Image Processing and Capsule Networks, ICIPCN 2026
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Image Processing and Capsule Networks, ICIPCN 2026
Y2 - 27 January 2026 through 29 January 2026
ER -