A Novel Leukemia Gene Features Extraction and Selection Technique for Robust Type Prediction Using Machine Learning

  • University of Sargodha
  • University of Engineering and Technology Lahore

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

The broad term ‘leukemia’ refers to several types of cancer affecting blood cells. Detecting and identifying the specific type of leukemia remains a major challenge in the medical field. Machine learning techniques can play a vital role in analyzing gene expression data from microarray experiments in leukemia research. Here we use the Leukemia Gene Expression data from the Curated Microarray Database (CuMiDa). Determining expression patterns from microarray data can be challenging. In this work, we combine Fisher’s linear discriminant analysis, a popular dimensionality reduction technique, with a new feature selection approach to predict leukemia type from microarray data. Our machine learning model predicts five types of leukemia, including AML, PBSC CD34, Bone Marrow, and CD34 from the bone marrow. The data features are first rescaled. A feature selection technique then obtains the 25 most significant of the dataset’s 22,283 features, and the dimension is further reduced to only 5 features to lower computational complexity. These features are fed into a Fisher’s linear discriminant module and a likelihood-based index for classification. We examine the results using 2, 4, 5, 6, and 7 selected features. The best classification accuracies are 89.6%, 96.92%, and 96.15% for 2, 5, and 7 selected features, respectively. Our results outperform the state of the art by about 4%, with a task completion time of less than 100 ms.
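The rescale-then-select steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the new selection criterion, so the per-feature Fisher score (between-class over within-class variance) is used here purely as a stand-in, not as the authors' method. The data are synthetic (200 features, 5 of them informative, 5 classes) rather than the CuMiDa set, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def min_max_rescale(X):
    """Rescale each column of X (a list of rows) to the range [0, 1]."""
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in X
    ]


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter.

    Assumption: this is a stand-in ranking criterion, not the selection
    technique proposed in the paper.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        between = within = 0.0
        for rows in by_class.values():
            vals = [r[j] for r in rows]
            m = sum(vals) / len(vals)
            between += len(vals) * (m - mean_all) ** 2
            within += sum((v - m) ** 2 for v in vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Return the sorted indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Synthetic stand-in data: 5 classes, 12 samples each, 200 features.
# Features 0..4 carry a class-dependent shift; the rest are pure noise.
random.seed(0)
N_CLASSES, PER_CLASS, N_FEATURES, N_INFORMATIVE = 5, 12, 200, 5
X, y = [], []
for c in range(N_CLASSES):
    for _ in range(PER_CLASS):
        row = [random.gauss(4.0 * ((c + j) % N_CLASSES), 1.0)
               for j in range(N_INFORMATIVE)]
        row += [random.gauss(0.0, 1.0)
                for _ in range(N_FEATURES - N_INFORMATIVE)]
        X.append(row)
        y.append(c)

X_scaled = min_max_rescale(X)
selected = select_top_k(X_scaled, y, k=5)
X_reduced = [[row[j] for j in selected] for row in X_scaled]
print("selected feature indices:", selected)
```

The reduced matrix `X_reduced` is what would then be passed to a discriminant classifier; on the real data the same pattern would go from 22,283 features to 25 and then to 5, using whatever criterion the paper actually defines.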

Original language: English
Pages (from-to): 16845-16863
Number of pages: 19
Journal: Arabian Journal for Science and Engineering
Volume: 49
Issue number: 12
State: Published - Dec 2024

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Dimensionality reduction
  • Gene features extraction
  • Leukemia prediction
  • Linear discriminant analysis
