A robust endpoint detection of speech for noisy environments with application to automatic speech recognition

  • Conexant

Research output: Contribution to journal › Conference article › peer-review

39 Scopus citations

Abstract

We propose a new approach for classifying speech vs. non-speech that significantly improves speech recognition performance under noise. The proposed algorithm relies on the energy and spectral characteristics of the signal and applies three-level, two-dimensional thresholds to determine whether an input frame is speech or non-speech. The algorithm runs in real time and offers better immunity to background noise and to background speech than traditional energy-based word boundary detection. The performance of the endpoint detector is reported here in terms of improvements in speaker-independent (SI) and speaker-dependent (SD) recognition performance using 5 different simulated noise conditions and various signal-to-noise ratios (SNR). The proposed endpoint detection of speech improves the SD recognition accuracy by 24% for office noise, and reduces the false rejection rates for both SI and SD by 45% for babble noise and lobby noise.
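
The abstract does not name the exact features or threshold values, so the following is only a minimal sketch of the general idea: classify each frame from an energy measure and a spectral measure taken together. Everything specific here is an assumption for illustration, not the paper's method: spectral flatness stands in for the unspecified "spectral characteristics", the three energy levels are one possible reading of "three-level, two-dimensional thresholds", and the names `frame_energy_db`, `spectral_flatness`, `classify_frame`, and all threshold values are hypothetical.

```python
import cmath
import math


def frame_energy_db(frame):
    """Mean-square energy of a frame in dB, floored to avoid log(0)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def spectral_flatness(frame):
    """Geometric mean over arithmetic mean of the power spectrum.

    Uses a naive DFT over the positive-frequency bins (DC excluded).
    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate energy concentrated in a few bins.
    """
    n = len(frame)
    powers = []
    for k in range(1, n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        powers.append(abs(acc) ** 2 + 1e-12)  # small floor keeps log finite
    log_mean = sum(math.log(p) for p in powers) / len(powers)
    return math.exp(log_mean) / (sum(powers) / len(powers))


def classify_frame(frame, low_db=-50.0, high_db=-20.0,
                   strict_flatness=0.3, loose_flatness=0.6):
    """Label one frame as "speech" or "non-speech".

    Assumed decision rule (hypothetical thresholds):
      level 0, energy below low_db: always non-speech
      level 1, energy between low_db and high_db: speech only if the
               spectrum is clearly peaked (flatness < strict_flatness)
      level 2, energy at or above high_db: speech unless the spectrum
               is very flat (flatness < loose_flatness)
    """
    energy = frame_energy_db(frame)
    if energy < low_db:
        return "non-speech"
    limit = loose_flatness if energy >= high_db else strict_flatness
    return "speech" if spectral_flatness(frame) < limit else "non-speech"


if __name__ == "__main__":
    n = 64
    tone = [0.5 * math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    click = [1.0] + [0.0] * (n - 1)  # impulse: loud, but flat spectrum
    print(classify_frame(tone), classify_frame(click))
```

The point of combining the two measures is that a loud but spectrally flat frame, such as the click above, can be rejected even though an energy-only detector would accept it. A real detector would also need to handle noise-like speech sounds such as unvoiced fricatives, which this flatness-only stand-in does not address.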

Original language: English
Pages (from-to): IV/3808-IV/3811
Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Volume: 4
State: Published - 2002
Externally published: Yes
Event: 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing - Orlando, FL, United States
Duration: 13 May 2002 to 17 May 2002
