
Parcimonious time frequency quantization for phoneme and speaker classification

  • Information and System Sciences Lab.
  • Naval Group

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Scopus citation

Abstract

Conventional speech processing may not adequately address some specific properties of the speech signal. In this paper we focus on a parsimonious representation of speech dynamics. We propose a novel coding strategy based on speech time-frequency quantization (TFQ), which applies simple Allen temporal interval algebra to subband voicing levels. Our compressed speech representation contains only 15 integers for a speech window up to 1 s long. We evaluate the discriminative power of these features for text-independent speaker recognition (60 hours, 62 speakers) and vowel recognition (1 hour, 6 vowels) on a reference radio broadcast news corpus used during the ESTER evaluation campaign, piloted by a French intelligence agency. The 30-integer TFQ code (feature compression factor (CF) of 26) classifies 62 speakers with an error reduction of 14% relative to the random classifier, whereas the 390 float voicing features give a similar score. This illustrates that TFQ may model co-articulation and speaking style. A preliminary model of speaker-independent vowel identification using 15-integer TFQ features (CF of 6.4) gives an error reduction of 15.1% relative to the random classifier, whereas the 48 float voicing levels give 31%. Further work to improve our parsimonious coding is then discussed.
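
To make the interval-algebra idea in the abstract more concrete, the following is a minimal sketch, not taken from the paper, of how Allen's thirteen qualitative relations between two time intervals can be computed. The representation of an interval as a `(start, end)` pair, the function name `allen_relation`, and the example voiced segments are all hypothetical. The abstract does not state how such relations are mapped to the TFQ integers, so that step is deliberately left out.

```python
def allen_relation(a, b):
    """Return the name of Allen's relation that holds between intervals a and b.

    Each interval is a (start, end) pair with start < end.
    Illustrative only: the paper's actual TFQ encoding is not described here.
    """
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met-by"

    # From here on the two intervals share more than a single endpoint.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    return "overlaps" if a0 < b0 else "overlapped-by"


# Hypothetical voiced segments (in frames) detected in two subbands.
low_band = (10, 40)
high_band = (25, 60)
print(allen_relation(low_band, high_band))  # overlaps
```

Integer frame indices are used in the example so that the equality tests on endpoints are exact; with floating-point times a tolerance would be needed.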

Original language: English
Title of host publication: IEEE Canadian Conference on Electrical and Computer Engineering, Proceedings, CCECE 2008
Pages: 1531-1534
Number of pages: 4
State: Published - 2008
Externally published: Yes
Event: IEEE Canadian Conference on Electrical and Computer Engineering, CCECE 2008 - Niagara Falls, ON, Canada
Duration: 4 May 2008 to 7 May 2008

Publication series

Name: Canadian Conference on Electrical and Computer Engineering
ISSN (Print): 0840-7789

Conference

Conference: IEEE Canadian Conference on Electrical and Computer Engineering, CCECE 2008
Country/Territory: Canada
City: Niagara Falls, ON
Period: 4/05/08 to 7/05/08

Keywords

  • Quantization
  • Speaker recognition
  • Speech analysis
  • Speech coding
  • Time-frequency
