A new wavelet-based approach for speech/music discrimination
Emmanuel Didiot, Irina Illina, Odile Mella, Dominique Fohr, Jean-Paul Haton.
Speech/music discrimination is a challenging research problem that significantly impacts Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We decompose the audio signal using wavelets, which are well suited to the analysis of non-stationary signals such as speech or music. We then compute several types of energy in each frequency band obtained from the wavelet decomposition.
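As an illustration of per-band energy features, the following minimal sketch decomposes one audio frame with a discrete wavelet transform and computes the mean energy of the coefficients in each band. The Haar wavelet, the three decomposition levels, the choice of mean energy, and the function names `haar_step`, `haar_decompose` and `band_energies` are all assumptions made for this example; the abstract does not specify the wavelet family or the energy types actually used.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(frame, levels):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    if levels < 1 or len(frame) == 0 or len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a non-zero multiple of 2**levels")
    details = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)          # cD_1 first, cD_n last
    return [approx] + details[::-1]     # lowest band first, highest band last


def band_energies(frame, levels=3):
    """Mean energy of the wavelet coefficients in each frequency band."""
    return [sum(c * c for c in band) / len(band)
            for band in haar_decompose(frame, levels)]
```

For example, `band_energies([1.0, -1.0] * 4)` concentrates all energy in the last (highest-frequency) band, whereas a constant frame puts all of it in the first (approximation) band. A feature vector for one frame would be the list returned by `band_energies`, possibly log-compressed before classification.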
We use two class/non-class classifiers: one for speech/non-speech and one for music/non-music.
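The sketch below shows one way the outputs of two independent class/non-class detectors could be combined into a segment label. It is only an illustration: the function `label_segment`, the score convention (a positive score means the class is detected), the threshold, and the four output labels are hypothetical and are not taken from the paper.

```python
def label_segment(speech_score, music_score, threshold=0.0):
    """Combine two binary detectors into one label.

    Each score is assumed to come from its own class/non-class classifier
    (for instance a log-likelihood ratio); a score above `threshold`
    means the class is present. Both conventions are assumptions.
    """
    is_speech = speech_score > threshold
    is_music = music_score > threshold
    if is_speech and is_music:
        return "speech+music"
    if is_speech:
        return "speech"
    if is_music:
        return "music"
    return "other"
```

Because the two decisions are taken independently, a segment can be labelled as both speech and music (for example speech over background music), which a single speech-versus-music classifier could not express.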
On a broadcast corpus, the proposed wavelet approach yields a significant improvement (35%) over MFCC parameters.