Application of a Genetic Algorithm to the Synthesis of a Non-linear Pre-processing for Speaker Segmentation and Clustering
Christophe Charbuillet, Bruno Gas, Mohamed Chetouani, Jean-Luc Zarader.
Speech feature extraction plays a major role in a speaker recognition system. Gas et al. [1] showed that non-linear filtering of speech can improve the performance of the feature extractor. In this article we propose to use genetic algorithms to design a non-linear pre-processing of speech adapted to the speaker diarization task. The pre-processing system we present is based on artificial recurrent neural networks (ARNN). A genetic algorithm is used to find both the structure and the weights of the network.
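To make the idea of evolving both structure and weights concrete, the following is a minimal sketch and not the system described in this article. It assumes a genome made of a binary connection mask (the structure) and a real-valued weight matrix, a tiny tanh recurrent network, truncation selection with elitism, uniform crossover and Gaussian mutation. The network size, the operators, all function names and the toy fitness (matching a one-sample-delayed copy of the input) are illustrative assumptions; a real fitness would be derived from a diarization criterion.

```python
import math
import random

N_NEURONS = 4  # assumed size; neuron 0 receives the input, the last neuron is the output


def random_genome(rng):
    """Genome = (binary connection mask, real-valued weights), both N x N."""
    mask = [[rng.random() < 0.5 for _ in range(N_NEURONS)] for _ in range(N_NEURONS)]
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(N_NEURONS)] for _ in range(N_NEURONS)]
    return mask, weights


def run_network(genome, signal):
    """Filter a signal sample by sample through the recurrent network."""
    mask, weights = genome
    state = [0.0] * N_NEURONS
    output = []
    for x in signal:
        new_state = []
        for i in range(N_NEURONS):
            total = x if i == 0 else 0.0  # only neuron 0 sees the input sample
            for j in range(N_NEURONS):
                if mask[i][j]:  # connection j -> i exists in this structure
                    total += weights[i][j] * state[j]
            new_state.append(math.tanh(total))
        state = new_state
        output.append(state[-1])
    return output


def fitness(genome, signal, target):
    """Toy fitness (assumption): negative mean squared error against a target."""
    out = run_network(genome, signal)
    return -sum((o - t) ** 2 for o, t in zip(out, target)) / len(target)


def crossover(a, b, rng):
    """Uniform crossover: each connection (mask bit and weight) comes from one parent."""
    mask, weights = [], []
    for i in range(N_NEURONS):
        mask_row, weight_row = [], []
        for j in range(N_NEURONS):
            parent = a if rng.random() < 0.5 else b
            mask_row.append(parent[0][i][j])
            weight_row.append(parent[1][i][j])
        mask.append(mask_row)
        weights.append(weight_row)
    return mask, weights


def mutate(genome, rng, flip_prob=0.05, sigma=0.1):
    """Flip mask bits with a small probability and add Gaussian noise to the weights."""
    mask, weights = genome
    new_mask = [[(not m) if rng.random() < flip_prob else m for m in row] for row in mask]
    new_weights = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
    return new_mask, new_weights


def evolve(signal, target, pop_size=20, generations=30, seed=0):
    """Return the best genome found and the best fitness of each generation."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(g, signal, target), reverse=True)
        history.append(fitness(ranked[0], signal, target))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    best = max(population, key=lambda g: fitness(g, signal, target))
    return best, history


if __name__ == "__main__":
    signal = [math.sin(0.3 * t) for t in range(50)]
    target = [0.0] + signal[:-1]  # toy target: the input delayed by one sample
    best, history = evolve(signal, target)
    print("best fitness, first and last generation:", history[0], history[-1])
```

Because the best individual is copied unchanged into the next generation, the best fitness recorded per generation cannot decrease in this sketch. Encoding the structure as a mask separate from the weights lets mutation switch a connection off and on again without losing its weight, which is one possible design choice among several.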
Experiments were carried out using a state-of-the-art speaker diarization system. The results show that the proposed method gives a significant improvement, reducing the diarization error rate from 17.38 % to 15.77 % (a relative reduction of about 9.3 %).