Ontology-concepts weighting for enhanced semantic classification of documents

  • Al Ahliyya Amman University

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Automatic document classification has become increasingly important, and increasingly difficult, because of the large volume of electronic documents produced in recent years. Traditional information retrieval systems extract keywords from documents and use these keywords as the basis for document classification. This paper proposes a new semantic approach to document classification. Specifically, in addition to keyword frequency, our approach captures the meaning of the keywords in a document using a domain ontology. The main idea is to represent documents by concepts rather than keywords, and to calculate a weight for each concept that reflects its importance in the documents where it appears. The co-occurrence of concepts in the same paragraph, section, document, or document set provides important information for extracting and understanding the semantic content of a document, and therefore improves its classification. The experimental evaluation is carried out using the Reuters document collection RCV1-v2 and the GALEN medical ontology. The documents are classified using an SVM classifier. The experimental results demonstrate that the proposed approach yields higher accuracy, precision, and recall than traditional keyword-based information retrieval approaches.
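
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea: map keywords to concepts, then weight each concept by its frequency plus a bonus when it shares a paragraph with another concept. The `TERM_TO_CONCEPT` table, the concept labels, the `cooccurrence_bonus` parameter, and the weighting rule itself are all illustrative assumptions, not the authors' method; a real system would look terms up in a domain ontology such as GALEN.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy lookup standing in for a domain ontology:
# surface keyword -> concept label (labels are made up for illustration).
TERM_TO_CONCEPT = {
    "heart": "CardiacStructure",
    "cardiac": "CardiacStructure",
    "artery": "BloodVessel",
    "vein": "BloodVessel",
    "aspirin": "Drug",
}


def concepts_in(text):
    """Return the concepts mentioned in a piece of text, in order of appearance."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT]


def concept_weights(paragraphs, cooccurrence_bonus=0.5):
    """Weight concepts by frequency plus a paragraph-level co-occurrence bonus.

    `paragraphs` is a list of paragraph strings making up one document.
    The weighting rule is an assumption chosen for illustration only.
    Returns a dict of concept -> normalized weight (weights sum to 1).
    """
    weights = Counter()
    for paragraph in paragraphs:
        found = concepts_in(paragraph)
        weights.update(found)  # raw concept frequency
        # Each pair of distinct concepts in the same paragraph earns a bonus.
        for a, b in combinations(sorted(set(found)), 2):
            weights[a] += cooccurrence_bonus
            weights[b] += cooccurrence_bonus
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


doc = [
    "The heart pumps blood through each artery.",
    "Aspirin may be prescribed.",
]
print(concept_weights(doc))
# {'CardiacStructure': 0.375, 'BloodVessel': 0.375, 'Drug': 0.25}
```

A concept vector of this kind could then be passed to a classifier such as an SVM in place of a keyword vector; that step is not shown here.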

Original language: English
Pages (from-to): 519-531
Number of pages: 13
Journal: International Journal of Innovative Computing, Information and Control
Volume: 12
Issue number: 2
State: Published - 2016
Externally published: Yes

Keywords

  • Concept semantic weighting
  • Documents classification
  • Domain ontology
  • Information extraction
  • Information retrieval
