Publications

ROBUSER: A robustness benchmark for Speech Emotion Recognition

Abstract

The recent surge in deep learning has improved the performance of Speech Emotion Recognition (SER) models; however, ensuring robustness across diverse scenarios beyond the training dataset remains an open problem. The challenge is especially pronounced in noisy real-world conditions, where a model's ability to adapt to unclean data is crucial. Despite ongoing efforts to develop noise-robust models, the lack of standardized evaluation protocols hampers fair comparison among models. This paper tackles the issue by introducing Robuser, a benchmarking procedure designed specifically for evaluating the robustness of SER models under noise. Robuser is a comprehensive open-source benchmark that can be applied to any speech dataset, focusing on diverse corruption types in two pivotal dimensions: additive background noise and various signal distortion corruptions, each in varying …

Date: September 15, 2024
Authors: Antonia Petrogianni, Lefteris Kapelonis, Nikolaos Antoniou, Sofia Eleftheriou, Petros Mitseas, Dimitris Sgouropoulos, Athanasios Katsamanis, Theodoros Giannakopoulos, Shrikanth Narayanan
Conference: 2024 12th International Conference on Affective Computing and Intelligent Interaction (ACII)
Pages: 219-227
Publisher: IEEE