Neural Arabic text diacritization: State of the art results and a novel approach for machine translation

  • Jordan University of Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

19 Scopus citations

Abstract

In this work, we present several deep learning models for the automatic diacritization of Arabic text. Our models are built using two main approaches, viz. Feed-Forward Neural Network (FFNN) and Recurrent Neural Network (RNN), with several enhancements such as 100-hot encoding, embeddings, Conditional Random Field (CRF) and Block-Normalized Gradient (BNG). The models are tested on the only freely available benchmark dataset, and the results show that our models are either better than or on par with other models, which, unlike ours, require language-dependent post-processing steps. Moreover, we show that diacritics in Arabic can be used to enhance the models of NLP tasks such as Machine Translation (MT) by proposing the Translation over Diacritization (ToD) approach.
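As an illustration of the task the abstract describes, diacritization can be framed as per-character labeling: each base character is paired with the diacritic marks that follow it. The sketch below is not taken from the paper; the function names `split_diacritics` and `apply_diacritics` are hypothetical, and it assumes the diacritics of interest are the Unicode marks U+064B to U+0652.

```python
# Arabic diacritic marks (tashkeel), assumed here to be U+064B..U+0652
# (fathatan through sukun). This range is an assumption of the sketch.
DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Split text into base characters and per-character diacritic labels.

    Returns (chars, labels), where labels[i] is the (possibly empty) string
    of diacritics that followed chars[i] in the input.
    """
    chars, labels = [], []
    for ch in text:
        if ch in DIACRITICS and chars:
            # Attach the mark to the most recent base character.
            labels[-1] += ch
        else:
            chars.append(ch)
            labels.append("")
    return chars, labels


def apply_diacritics(chars, labels):
    """Inverse of split_diacritics: re-attach each label to its character."""
    return "".join(c + l for c, l in zip(chars, labels))


if __name__ == "__main__":
    # kaf+fatha, ta+fatha, ba+fatha, written with escapes for clarity
    text = "\u0643\u064e\u062a\u064e\u0628\u064e"
    chars, labels = split_diacritics(text)
    print(len(chars), len(labels))
    print(apply_diacritics(chars, labels) == text)
```

Under this framing, a model such as an FFNN or RNN would take `chars` as input and predict `labels`; how the paper's models encode inputs and outputs is not specified in the abstract.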

Original language: English
Title of host publication: WAT@EMNLP-IJCNLP 2019 - 6th Workshop on Asian Translation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 215-225
Number of pages: 11
ISBN (Electronic): 9781950737871
State: Published - 2021
Externally published: Yes
Event: 6th Workshop on Asian Translation, WAT@EMNLP-IJCNLP 2019 - Hong Kong, China
Duration: 4 Nov 2019 → …

Publication series

Name: WAT@EMNLP-IJCNLP 2019 - 6th Workshop on Asian Translation, Proceedings

Conference

Conference: 6th Workshop on Asian Translation, WAT@EMNLP-IJCNLP 2019
Country/Territory: China
City: Hong Kong
Period: 4/11/19 → …

