Proposal of a new methodology for the automatic selection of the vocabulary of an automatic speech recognition system
The vocabulary of an Automatic Speech Recognition (ASR) system
is a significant factor in determining its performance.
The goal of vocabulary selection is to construct a vocabulary
with exactly those words that are the most likely to appear
in the test data.
This paper proposes a new measure to evaluate the quality of
a vocabulary for a domain-specific ASR application.
This Q_alpha-measure is based on the trade-off between the
target lexical coverage and vocabulary size. Experiments were
carried out on French Broadcast News Transcriptions, comparing the
Q_alpha-measure with the state-of-the-art method.
Results for the two methods systematically favor the proposed