Proposal of a New Methodology for the Automatic Selection of the Vocabulary of an Automatic Speech Recognition System
Brigitte Bigi.

The vocabulary of an Automatic Speech Recognition (ASR) system is a significant factor in determining its performance. The goal of vocabulary selection is to construct a vocabulary containing exactly those words that are most likely to appear in the test data. This paper proposes a new measure for evaluating the quality of a vocabulary for a domain-specific ASR application. This Q_alpha-measure is based on the trade-off between the target lexical coverage and the vocabulary size. Experiments were carried out on French Broadcast News transcriptions, comparing the Q_alpha-measure with the state-of-the-art method. The results systematically favor the proposed methodology.
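
To make the two quantities in this trade-off concrete, the following minimal Python sketch computes the lexical coverage of a test text for vocabularies of increasing size. It does not implement the Q_alpha-measure, whose definition is not given in this abstract. The frequency-based selection, the function names `top_k_vocabulary` and `lexical_coverage`, and the toy sentences are all assumptions made for illustration only.

```python
from collections import Counter


def top_k_vocabulary(train_tokens, k):
    """Return the k most frequent training words as a set.

    Assumption: plain frequency ranking stands in for a real selection method.
    """
    counts = Counter(train_tokens)
    return {word for word, _ in counts.most_common(k)}


def lexical_coverage(vocabulary, test_tokens):
    """Fraction of test tokens that belong to the vocabulary (1 - OOV rate)."""
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    covered = sum(1 for token in test_tokens if token in vocabulary)
    return covered / len(test_tokens)


# Toy data, invented for this sketch (not taken from the paper's corpus).
train = "le journal de vingt heures le journal de la nuit le temps".split()
test = "le journal de la semaine".split()

for k in (1, 3, 8):
    vocab = top_k_vocabulary(train, k)
    print(k, lexical_coverage(vocab, test))
# prints:
# 1 0.2
# 3 0.6
# 8 0.8
```

In this toy run, coverage grows as the vocabulary grows but never reaches 1.0, because "semaine" does not occur in the training text. A measure of vocabulary quality has to weigh this kind of coverage gain against the cost of a larger vocabulary.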