
Early breast cancer diagnostics based on hierarchical machine learning classification for mammography images

  • M. Saeed Darweesh
  • Mostafa Adel
  • Ahmed Anwar
  • Omar Farag
  • Ahmed Kotb
  • Mohamed Adel
  • Ayman Tawfik
  • Hassan Mostafa
  • Nile University
  • Cairo University
  • German University in Cairo
  • National Institute of Laser Enhanced Sciences
  • Zewail City of Science and Technology

Research output: Contribution to journal, Article, peer-reviewed

28 Scopus citations

Abstract

Breast cancer poses a significant threat to women’s health and is considered the second leading cause of death among women. It results from abnormal behavior of normal breast cells, which begin to grow uncontrollably and form a tumor that can be felt as a breast lump. Early diagnosis has been shown to reduce the risk of death by improving the chance of identifying a suitable treatment. Machine learning and artificial intelligence play a key role in healthcare systems by helping physicians diagnose diseases earlier and more accurately and treat them more effectively. To support early detection of breast cancer, this paper proposes a machine learning-based, two-level, top-down hierarchical approach for breast cancer detection and classification into three classes (normal, benign, and malignant) using the Mammographic Image Analysis Society (MIAS) mammography dataset. Several data preprocessing techniques are applied before feature extraction and classification. The first classification stage, which distinguishes normal from abnormal cases, uses the Gray Level Co-occurrence Matrix (GLCM) for feature extraction and a random forest as the classifier. The second stage, which classifies the abnormal cases as benign or malignant, uses Local Binary Patterns (LBP) for feature extraction and a random forest as the classifier. The first stage achieves a classification accuracy of 97%, with F1-scores of 0.98 and 0.97 for the normal and abnormal classes, respectively. The second stage achieves a classification accuracy of 75%, with F1-scores of 0.76 and 0.74 for the benign and malignant classes, respectively. The overall hierarchical classification system achieves a classification accuracy of 85%, a Matthews correlation coefficient (MCC) of 0.76, and F1-scores of 0.98, 0.7, and 0.74 for normal, benign, and malignant test cases, respectively.
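
The two-stage flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The GLCM offset, the number of gray levels, the basic 3x3 LBP variant, and the helper names (`glcm`, `glcm_contrast`, `lbp_code`, `hierarchical_predict`) are all assumptions made for illustration, and the stage classifiers are passed in as plain callables standing in for the random forests used in the paper.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) is an illustrative assumption.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    total = total or 1  # avoid division by zero on degenerate inputs
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast texture feature: sum of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def lbp_code(image, r, c):
    """Basic 8-neighbour local binary pattern code for an interior pixel.

    Neighbours are visited clockwise from the top-left; a neighbour that is
    greater than or equal to the centre sets its bit.
    """
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def hierarchical_predict(sample, stage1, stage2):
    """Top-down routing: stage 2 is consulted only for abnormal cases.

    `stage1` returns "normal" or "abnormal"; `stage2` returns "benign" or
    "malignant". Both are hypothetical stand-ins for trained classifiers.
    """
    if stage1(sample) == "normal":
        return "normal"
    return stage2(sample)
```

Because the second stage only sees cases the first stage labels abnormal, an error in the first stage propagates to the final three-class output, which is consistent with the overall accuracy lying between the two per-stage accuracies.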

Original language: English
Article number: 1968324
Journal: Cogent Engineering
Volume: 8
Issue number: 1
State: Published - 2021

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Breast cancer
  • Gray Level Co-Occurrence Matrix (GLCM)
  • Local Binary Patterns (LBP)
  • Machine learning
  • Mammography
  • Random forest
