Enhanced Neural Speech Recognition of Quranic Recitations via a Large Audio Model †

  • Jordan University of Science and Technology
  • American University of Ras Al Khaimah

Research output: Contribution to journal › Article › peer-review

Abstract

In this work, we build on our earlier efforts to develop a neural speech recognition (NSR) system for Quranic recitations that is accessible to people of any age, gender, or expertise level. Our prior work introduced the Quran recitations by females and males (QRFAM) dataset, a sizable benchmark of audio recordings made by male and female reciters from various age groups and competence levels. We also used various subsets of the QRFAM dataset for training, validation, and testing to build several baseline NSR systems based on Mozilla’s DeepSpeech model, and we presented our efforts to optimize and enhance these baselines. In this study, we extend that work by adopting Whisper, a well-known speech recognition model, and we describe the effect of this choice on the model’s accuracy, expressed as the word error rate (WER), in comparison with DeepSpeech.
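
For readers unfamiliar with the metric, WER is conventionally defined as (S + D + I) / N, where S, D, and I are the numbers of word substitutions, deletions, and insertions needed to turn the reference transcript into the recognizer's output, and N is the number of words in the reference. The following is a minimal sketch of that standard definition, not the evaluation code used in the article; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the sample strings are placeholder tokens rather than QRFAM data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace; any text normalization
    (for example, handling of diacritics) is left to the caller.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # prefix processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) against a 4-word reference.
print(word_error_rate("a b c d", "a x c"))
```

A lower WER indicates a more accurate transcript. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.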

Original language: English
Article number: 9521
Journal: Applied Sciences (Switzerland)
Volume: 15
Issue number: 17
DOIs
State: Published - Sep 2025

Keywords

  • DeepSpeech
  • QRFAM dataset
  • automatic speech recognition
  • large audio models
