TY - GEN
T1 - Comparative Analysis of Malay Vowel Recognition Using MFCC and Formant Features with Logistic Regression and Neural Networks Classifications
AU - Al-Rifai, Osama H.
AU - Yusof, S. A. Mohd
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
AB - Robust vowel recognition is essential for Automatic Speech Recognition (ASR) systems, particularly in low-resource languages like Malay. This study evaluates the classification performance of Malay vowels (/a/, /e/, /o/, /u/, and /i/) using four feature sets: 13 Mel-Frequency Cepstral Coefficients (MFCCs), 33-MFCCs, 13-MFCCs with formants, and 33-MFCCs with formants. Two classifiers, Logistic Regression (LR) and Neural Networks (NNs), were assessed to determine the impact of feature dimensionality and spectral information on recognition accuracy. Results show that Neural Networks consistently outperform Logistic Regression across all feature sets, achieving the highest accuracy of 98.07% with 13-MFCCs and formants. While Logistic Regression performs competitively with simpler feature sets, it struggles with spectral ambiguities in higher-dimensional spaces. These findings emphasize the importance of integrating spectral and formant features for improved Malay vowel classification. This study offers practical insights into feature selection and model design for ASR systems in low-resource languages, paving the way for future research into hybrid and deep learning models for multilingual speech recognition.
KW - Logistic Regression
KW - MFCCs
KW - Neural Networks
KW - Vowel Recognition
UR - https://www.scopus.com/pages/publications/105028252481
U2 - 10.1007/978-3-032-00232-7_60
DO - 10.1007/978-3-032-00232-7_60
M3 - Conference contribution
AN - SCOPUS:105028252481
SN - 9783032002310
T3 - Studies in Computational Intelligence
SP - 965
EP - 982
BT - Selected Papers from the International Conference on Artificial Intelligence - FICAILY2025 - Current Research, Industry Trends, and Innovations
A2 - Albaji, Ali Othman
PB - Springer Science and Business Media Deutschland GmbH
T2 - International Conference on AI: Current Research, Industry Trends, and Innovations, FICAILY 2025
Y2 - 9 July 2025 through 10 July 2025
ER -