
Detecting Malware Families and Subfamilies using Machine Learning Algorithms: An Empirical Study

  • Jordan University of Science and Technology

Research output: Contribution to journal, Article, peer-review

23 Scopus citations

Abstract

Machine learning algorithms have proven effective at detecting malware. This paper presents an empirical study of how well selected machine learning algorithms detect and classify Android malware using permission features. The dataset consists of 9000 malicious applications drawn from the CIC-Maldroid2020, CIC-Maldroid2017 and CIC-InvesAndMal2019 datasets collected by the Canadian Institute for Cybersecurity. Meta-Multiclass and Random Forest ensemble classifiers, built on different base machine learning classifiers, are used to overcome the imbalance among the data classes. Moreover, when classifying Ransomware subfamilies, a genetic attribute selection technique and SMOTE are used to handle the small size of the dataset and the underfitting problem. The results show that optimization and ensemble approaches are successful in treating these dataset issues, with 95% accuracy in classifying the big malware families and 80% in classifying Ransomware subfamilies.
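To make the oversampling step concrete, the following is a minimal pure-Python sketch of the SMOTE idea: a synthetic minority sample is created by interpolating between an existing minority sample and one of its nearest minority neighbours. It is not the authors' implementation, and the abstract does not say which tooling they used. The function name `smote`, its parameters, and the four-feature permission vectors are hypothetical and chosen only for illustration.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point somewhere on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical binary permission vectors for an under-represented subfamily
# (1 = the application requests the permission, 0 = it does not).
rare_subfamily = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]

new_samples = smote(rare_subfamily, n_new=6)
print(len(new_samples))
```

Because the interpolation is continuous, the synthetic vectors contain fractional values even when the original permission features are binary. A real pipeline would have to decide whether to round them or to use a variant of SMOTE designed for categorical features.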

Original language: English
Pages (from-to): 761-765
Number of pages: 5
Journal: International Journal of Advanced Computer Science and Applications
Volume: 13
Issue number: 2
State: Published - 2022

Keywords

  • Information security
  • Machine learning
  • Malware classification
  • SMOTE
