Scalable multi-label Arabic text classification

  • Jordan University of Science and Technology

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

35 Scopus citations

Abstract

Multi-label text classification (MTC) is a natural extension of traditional text classification (TC) in which each document can be assigned a possibly large set of labels. The high dimensionality of the label space makes MTC challenging. Several approaches have been proposed to ease the classification process; one of them is the problem transformation (PT) method, which transforms multi-label data into single-label data suitable for conventional classifiers. This paper presents a detailed study of a supervised approach to the MTC problem for Arabic text, and the experiments also consider the scalability of that approach. The MEKA system is used to convert the multi-label data into single-label data using different PT methods: LC, BR and RT. Then, classifiers commonly used for TC, such as SVM, NB, KNN, and Decision Tree, are applied to each resulting dataset. The results show that SVM on the LC dataset performs best, with 71% ML-accuracy.
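To make the problem transformation idea concrete, here is a minimal sketch of two of the PT methods named in the abstract. It reads LC as label combination and BR as binary relevance; the abstract does not expand these abbreviations, so that reading is an assumption. The toy documents, label names, and function names are hypothetical, and this is not MEKA's implementation.

```python
# Toy multi-label data (hypothetical): one label set per document.
labels = [{"sports", "politics"}, {"economy"}, {"sports", "politics"}]


def label_combination(label_sets):
    """LC sketch: treat each distinct label set as one atomic class."""
    class_of = {}   # maps a sorted label tuple to a class id
    targets = []    # one single-label target per document
    for label_set in label_sets:
        key = tuple(sorted(label_set))
        if key not in class_of:
            class_of[key] = len(class_of)
        targets.append(class_of[key])
    return targets, class_of


def binary_relevance(label_sets):
    """BR sketch: build one binary (0/1) target vector per label."""
    all_labels = sorted(set().union(*label_sets))
    return {
        label: [int(label in label_set) for label_set in label_sets]
        for label in all_labels
    }


lc_targets, lc_classes = label_combination(labels)
br_targets = binary_relevance(labels)
```

Under LC, a single multi-class classifier is trained on `lc_targets`; under BR, one binary classifier is trained per entry of `br_targets`. Either way, any ordinary single-label learner can then be applied.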

Original language: English
Title of host publication: 2015 6th International Conference on Information and Communication Systems, ICICS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 212-217
Number of pages: 6
ISBN (Electronic): 9781479973491
State: Published - 6 May 2015
Externally published: Yes
Event: 6th International Conference on Information and Communication Systems, ICICS 2015 - Amman, Jordan
Duration: 7 Apr 2015 to 9 Apr 2015

Publication series

Name: 2015 6th International Conference on Information and Communication Systems, ICICS 2015

Conference

Conference: 6th International Conference on Information and Communication Systems, ICICS 2015
Country/Territory: Jordan
City: Amman
Period: 7/04/15 to 9/04/15

Keywords

  • Exact match
  • Hamming loss
  • MEKA
  • Multi-label classification
  • Problem transformation methods
  • Scalability
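
Two of the keywords above, Hamming loss and exact match, are standard multi-label evaluation measures. The sketch below shows their usual textbook definitions on hypothetical toy predictions. The function names and data are assumptions for illustration, and the code does not reproduce the paper's evaluation or its ML-accuracy figure.

```python
def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (document, label) decisions that are wrong."""
    # The symmetric difference counts labels missed plus labels wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(all_labels))


def exact_match(true_sets, pred_sets):
    """Fraction of documents whose predicted label set is entirely correct."""
    matches = sum(t == p for t, p in zip(true_sets, pred_sets))
    return matches / len(true_sets)


# Hypothetical toy data: three possible labels, two documents.
all_labels = {"sports", "politics", "economy"}
y_true = [{"sports", "politics"}, {"economy"}]
y_pred = [{"sports"}, {"economy"}]

hl = hamming_loss(y_true, y_pred, all_labels)
em = exact_match(y_true, y_pred)
```

Hamming loss gives partial credit for label sets that are almost right, while exact match counts a document as correct only when every label agrees, so it is the stricter of the two.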
