Publications
ROBUSER: A robustness benchmark for Speech Emotion Recognition
Abstract
The recent surge in deep learning has improved Speech Emotion Recognition (SER) model performance; however, ensuring robustness across diverse scenarios beyond the training dataset remains a problem. This challenge becomes pronounced in real-world situations characterized by noisy conditions, where model adaptability to unclean data is crucial. Despite ongoing efforts to develop noise-robust models, the lack of standardized evaluation protocols hampers fair comparisons among different models. This paper tackles this issue by introducing Robuser, a benchmarking procedure designed specifically for evaluating the robustness of SER models under noise. Robuser is a comprehensive open-source benchmark that can be applied to any speech dataset, focusing on diverse corruption types in two pivotal dimensions: additive background noise and various signal distortion corruptions, each in varying …
- Date: September 15, 2024
- Authors: Antonia Petrogianni, Lefteris Kapelonis, Nikolaos Antoniou, Sofia Eleftheriou, Petros Mitseas, Dimitris Sgouropoulos, Athanasios Katsamanis, Theodoros Giannakopoulos, Shrikanth Narayanan
- Conference: 2024 12th International Conference on Affective Computing and Intelligent Interaction (ACII)
- Pages: 219-227
- Publisher: IEEE
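
The abstract names additive background noise as one of the two corruption dimensions. As a rough illustration only, the sketch below mixes a noise clip into a speech signal at a chosen signal-to-noise ratio. It is not taken from the benchmark's code: the function names `power` and `mix_at_snr`, the use of SNR in dB as the intensity control, and the looping of short noise clips are all assumptions made for this example.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not the benchmark's API.
    """
    if len(noise) < len(speech):
        # Loop the noise clip so it covers the whole utterance.
        repeats = math.ceil(len(speech) / len(noise))
        noise = list(noise) * repeats
    noise = noise[:len(speech)]

    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise clip is silent")

    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy usage: a 220 Hz tone as stand-in "speech", white noise as background.
random.seed(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Sweeping `snr_db` over a fixed set of values would give a graded series of corruption levels for the same utterance, which is one common way to compare models under identical noise conditions.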