Publications

Formant Tracking by Combining Deep Neural Network and Linear Prediction

Abstract

Formant tracking is an area of speech science that has recently undergone a technology shift from classical model-driven signal processing methods to modern data-driven deep learning methods. In this study, these two domains are combined in formant tracking by refining the formants estimated by a data-driven deep neural network (DNN) with formant estimates given by a model-driven linear prediction (LP) method. In the refinement process, the three lowest formants, initially estimated by the DNN-based method, are frame-wise replaced with local spectral peaks identified by the LP method. The LP-based refinement stage can be seamlessly integrated into the DNN without any training. As an LP method, the study advocates the use of quasi-closed phase forward-backward (QCP-FB) analysis. Three spectral representations are compared as DNN inputs: mel-frequency cepstral coefficients (MFCCs), the …
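The frame-wise refinement described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that each DNN formant is replaced by the LP spectral peak nearest to it in frequency, and that frames without any LP peak keep the DNN estimate; the abstract does not state either rule. The names `refine_frame` and `refine_track` are hypothetical, and the LP peak frequencies are taken as given rather than computed with QCP-FB analysis.

```python
import numpy as np


def refine_frame(dnn_formants, lp_peaks):
    """Replace each DNN formant estimate (Hz) with the closest LP peak (Hz).

    Assumption: "closest in frequency" is the matching rule; the source
    abstract only says the estimates are replaced with local spectral peaks.
    """
    dnn_formants = np.asarray(dnn_formants, dtype=float)
    lp_peaks = np.asarray(lp_peaks, dtype=float)
    if lp_peaks.size == 0:
        # Assumed fallback: no LP peaks in this frame, keep the DNN estimates.
        return dnn_formants.copy()
    # Absolute distance from every DNN formant to every LP peak,
    # shape (n_formants, n_peaks), built by broadcasting.
    dist = np.abs(dnn_formants[:, None] - lp_peaks[None, :])
    # Pick, for each formant, the LP peak with the smallest distance.
    return lp_peaks[np.argmin(dist, axis=1)]


def refine_track(dnn_track, lp_peaks_per_frame):
    """Apply the frame-wise refinement to a whole utterance.

    dnn_track: sequence of frames, each holding the three lowest formants (Hz).
    lp_peaks_per_frame: sequence of per-frame LP peak frequency lists (Hz).
    """
    return np.stack(
        [refine_frame(f, p) for f, p in zip(dnn_track, lp_peaks_per_frame)]
    )


# Made-up values for illustration, not data from the study.
dnn_track = [[500.0, 1500.0, 2500.0], [480.0, 1600.0, 2400.0]]
lp_peaks_per_frame = [[520.0, 1450.0, 2600.0, 3500.0], []]
refined = refine_track(dnn_track, lp_peaks_per_frame)
```

Because the replacement is a lookup applied to the network output, a step of this kind needs no trainable parameters, which is consistent with the abstract's remark that the LP-based stage is added without any training.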

Date
January 16, 2025
Authors
Sudarsana Reddy Kadiri, Kevin Huang, Christina Hagedorn, Dani Byrd, Paavo Alku, Shrikanth Narayanan
Journal
IEEE Open Journal of Signal Processing
Publisher
IEEE