Robust speech recognition in real environments using a microphone array with adaptive beamforming based on an N-best maximum likelihood criterion
Luca Brayda, Christian Wellekens, Maurizio Omologo.
Distant-talking speech recognition in noisy environments is generally tackled by using a microphone array and related multi-channel processing.
Within that framework, this paper proposes an N-best extension of the Limabeam algorithm, which is an adaptive maximum likelihood beamformer.
N-best hypothesized transcriptions are generated in a first recognition step and then optimized independently of one another.
As a result, the N-best list is re-ranked, which allows selection of the transcription that is maximally likely with respect to clean speech models.
Results on real data show improvements over both Delay and Sum Beamforming and Unsupervised Limabeam at low SNR and with moderate reverberation.
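
The re-ranking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `rerank_nbest`, the `optimize` callback, and the toy scores are all hypothetical and not taken from the paper. The sketch assumes that optimizing the beamformer for one hypothesis yields a single log-likelihood under clean speech models, and it shows only how independently optimized hypotheses could be re-ranked.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_nbest(
    hypotheses: Sequence[str],
    optimize: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Score each hypothesis independently, then sort by descending score.

    `optimize` is a hypothetical stand-in for the per-hypothesis beamformer
    optimization; it is assumed to return the log-likelihood reached for
    that hypothesis under clean speech models.
    """
    scored = [(hyp, optimize(hyp)) for hyp in hypotheses]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy log-likelihoods, invented purely for illustration (not results from the paper).
TOY_SCORES = {
    "turn the light on": -41.0,
    "turn the light off": -38.5,
    "burn the light on": -47.2,
}


def toy_optimize(hypothesis: str) -> float:
    """Placeholder: a real system would adapt the beamformer filters here."""
    return TOY_SCORES[hypothesis]


# N-best list in the order produced by a (hypothetical) first recognition step.
nbest = ["turn the light on", "turn the light off", "burn the light on"]
reranked = rerank_nbest(nbest, toy_optimize)
best_transcription, best_score = reranked[0]
```

In this toy run the second first-pass hypothesis moves to the top of the list, which is the kind of re-ranking the abstract refers to; because each hypothesis is scored on its own, the loop could also be run in parallel.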