Audiovisual Speech Recognition with VMike
Fabian Brugger, Leila Zouari, Hervé Bredin, Asmaa Amehraye, Dominique Pastor.
This article presents a new Electronic Retina-based Smart Microphone
(VMike) and investigates the use of its novel parameters, lip
profiles, in audiovisual speech recognition. To evaluate this
parameterization, an audio-only and a video-only speech
recognition system are developed and tested. Then, two main fusion
techniques are employed to test the usability of profiles in
audiovisual systems: feature fusion and decision fusion. These
results are compared to the performance of recognizers based on a
state-of-the-art parameterization, and also to results obtained by
applying perceptual filtering to the speech signal prior to
recognition. Under noisy conditions, feature fusion with lip
profiles improves recognition by up to 13 percent compared with
audio-only recognition.
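
The two fusion techniques named above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature dimensions, the function names `feature_fusion` and `decision_fusion`, and the stream weight `audio_weight` are all assumptions chosen for illustration. Feature fusion is shown as frame-wise concatenation of audio and lip-profile vectors before recognition; decision fusion is shown as a weighted sum of per-class log-likelihoods produced by two separate recognizers.

```python
import numpy as np


def feature_fusion(audio_feats, video_feats):
    """Early fusion: concatenate per-frame audio and video feature vectors.

    Both inputs are (n_frames, n_dims) arrays assumed to be already
    synchronized to the same frame rate (an assumption of this sketch).
    """
    if audio_feats.shape[0] != video_feats.shape[0]:
        raise ValueError("both streams must have the same number of frames")
    return np.concatenate([audio_feats, video_feats], axis=1)


def decision_fusion(audio_loglik, video_loglik, audio_weight=0.7):
    """Late fusion: weighted sum of per-class log-likelihoods.

    audio_weight is a hypothetical stream weight in [0, 1]; in practice it
    would be tuned, for example as a function of the noise level.
    Returns the index of the best class and the combined scores.
    """
    audio_loglik = np.asarray(audio_loglik, dtype=float)
    video_loglik = np.asarray(video_loglik, dtype=float)
    combined = audio_weight * audio_loglik + (1.0 - audio_weight) * video_loglik
    return int(np.argmax(combined)), combined


# Illustrative data only: 5 frames, 13 audio dimensions, 4 lip-profile dimensions.
fused = feature_fusion(np.zeros((5, 13)), np.ones((5, 4)))

# Two candidate words scored by separate audio and video recognizers.
best_equal, _ = decision_fusion([-1.0, -2.0], [-5.0, -1.0], audio_weight=0.5)
best_audio, _ = decision_fusion([-1.0, -2.0], [-5.0, -1.0], audio_weight=0.9)
```

In this sketch, lowering `audio_weight` lets the video stream override an unreliable audio decision, which is the usual motivation for decision fusion under noisy conditions.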