
A NOVEL FEDERATED-LEARNING BASED ADVERSARIAL FRAMEWORK FOR AUDIOVISUAL SPEECH ENHANCEMENT

  • Mohammed Amin Almaiah
  • Aitizaz Ali
  • Rima Shishakly
  • Tayseer Alkhdour
  • Abdalwali Lutfi
  • Mahmaod Alrawad
  • Aqaba University of Technology
  • Applied Science Private University
  • University of Jordan
  • UNITAR International University
  • King Faisal University

Research output: Contribution to journal, Article, peer-reviewed

10 Scopus citations

Abstract

Current speech enhancement (SE) techniques operate in the spectral domain and run either on edge devices or in the cloud. Most existing frameworks handle only a limited number of noise conditions and rely on first-order statistics. To address these limitations, researchers have explored machine learning approaches that learn complex functions and train on large datasets. However, these models typically rely on centralized servers such as the cloud, which raises security concerns. Furthermore, training such models on edge devices is challenging because of limited battery power and privacy issues. In this study, we propose a federated learning-based SE framework for multiple clients, using two speakers, to overcome these challenges. The proposed framework offers a decentralized model that supports both local and global training. It is well suited to adversarial networks and private clinics because it preserves privacy on edge devices and in the cloud, enabling SE in a distributed fashion. The proposed model lets multiple clients train on their own data independently and send the aggregated training model to the cloud. In contrast to existing approaches, our method operates at the waveform level, training the model end-to-end and incorporating two speakers with different noise conditions into a single model. Model parameters can then be shared with multiple clients through federated learning. Our approach provides improved security and speed and reduced battery usage for clients using hearing aids, resulting in enhanced robustness and supporting other speech-centric design choices that improve speech quality securely.
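The abstract does not say which aggregation rule the framework uses. As a rough illustration of the local-then-global training loop it describes, the following sketch assumes plain federated averaging (FedAvg), with a toy one-parameter linear model standing in for the waveform-level SE network. The names `local_update`, `federated_average`, and `run_round`, the learning rate, and the sample data are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A sample pairs a noisy feature vector with a clean target value.
# This toy format is an assumption; the paper works on raw waveforms.
Sample = Tuple[Sequence[float], float]


def local_update(weights: Sequence[float], samples: Sequence[Sample],
                 lr: float = 0.1) -> List[float]:
    """One pass of SGD on a client's private data (squared-error loss)."""
    w = list(weights)
    for noisy, clean in samples:
        pred = sum(wi * xi for wi, xi in zip(w, noisy))
        err = pred - clean
        w = [wi - lr * err * xi for wi, xi in zip(w, noisy)]
    return w


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighted by local dataset size (FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_round(global_weights: Sequence[float],
              client_data: Sequence[Sequence[Sample]]) -> List[float]:
    """One round: each client trains locally, then the server aggregates.

    Only parameters leave the clients; the raw samples stay on-device.
    """
    updates = [local_update(global_weights, data) for data in client_data]
    sizes = [len(data) for data in client_data]
    return federated_average(updates, sizes)


# Two hypothetical clients whose data follow clean = 2.0 * noisy.
clients = [
    [([1.0], 2.0)],
    [([2.0], 4.0), ([1.0], 2.0)],
]
weights = [0.0]
for _ in range(50):
    weights = run_round(weights, clients)
```

In this sketch the server only ever sees parameter vectors, which is the privacy property the abstract emphasizes. A real deployment would replace the toy model with the actual network and would likely add secure aggregation or similar protections, none of which are shown here.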

Original language: English
Pages (from-to): 1683-1693
Number of pages: 11
Journal: Journal of Theoretical and Applied Information Technology
Volume: 102
Issue number: 4
State: Published - 29 Feb 2024

Keywords

  • Cloud computing
  • Deep learning AV dataset
  • Federated Learning
  • SDG
  • Speech Enhancement
