A 2+1-level stochastic model for speech understanding
Hélène Bonneau-Maynard, Fabrice Lefèvre.

This paper presents an extension of the 2-level stochastic speech understanding model previously introduced in the context of the Arise corpus. In the new model, an additional stochastic level is in charge of attribute value normalization. Because of data sparseness, the full 3-level model cannot be applied directly, so a variant is introduced in which the conceptual decoding and value normalization phases are decoupled. The proposed approach is evaluated on the French Evalda-Media task (hotel booking and tourist information). This recent corpus has the advantage of being semantically annotated with conceptual segments, which allows direct training of the 2-level model. We also present further model improvements, such as modality propagation and 2-step hierarchical recomposition. Overall, the proposed techniques reduce the understanding error rate from 37.6% to 28.8% on the development set (24% relative improvement). The model was entered in the 2005 Media evaluation campaign, where it achieved the best results among the 5 participants, with an error rate of 29%.