Publications : Information Sciences Institute

Towards large-scale cross-speaker articulatory modeling of vowels

Abstract

While previous studies have attempted to decompose vowel articulatory data into a set of basis factors, they have often been limited in scale and have relied on sparsely sampled data, which limits the interpretability and generalizability of their results (Nix et al. 1996; Serrurier et al. 2019). In this study, real-time MRI data were analyzed from 36 American English speakers (23F, 13M) producing 13 vowels in bVt sequences. Midsagittal tongue contours were obtained during vowel productions for all speakers using a semi-automated segmentation algorithm (Jain et al. 2024). Frames corresponding to the vowel articulation were segmented using MFA on the simultaneously recorded audio. A combination of Procrustes analysis for cross-speaker normalization and guided PCA was employed to decompose the pooled articulatory space into a set of vowel “primitives.” 71% of the variation within the …
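
The normalization and decomposition steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the contours are synthetic, every function and variable name is hypothetical, each contour is aligned to a single reference shape rather than through whatever Procrustes variant the study used, and ordinary PCA stands in for the guided PCA the abstract mentions.

```python
import numpy as np


def procrustes_align(shape, reference):
    """Align one (n_points, 2) contour to a reference shape.

    Removes translation and scale, then applies the orthogonal transform
    (rotation, possibly with reflection) that best matches the reference.
    """
    a = shape - shape.mean(axis=0)          # remove translation
    b = reference - reference.mean(axis=0)
    a = a / np.linalg.norm(a)               # remove scale (unit Frobenius norm)
    b = b / np.linalg.norm(b)
    # Orthogonal Procrustes solution: R = U V^T minimizes ||a @ R - b||
    u, _, vt = np.linalg.svd(a.T @ b)
    return a @ (u @ vt)


def pca(data, n_components):
    """Plain PCA via SVD (an assumption; the abstract uses guided PCA)."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)         # variance ratio per component
    return vt[:n_components], explained[:n_components]


# Synthetic stand-in for tongue contours: 30 noisy, rotated, scaled, and
# shifted copies of one 20-point reference shape (hypothetical data).
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 2))
contours = []
for _ in range(30):
    theta = rng.uniform(-0.5, 0.5)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    noisy = reference + 0.05 * rng.normal(size=reference.shape)
    contours.append(rng.uniform(0.5, 2.0) * noisy @ rot + rng.normal(size=2))

# Normalize every contour, flatten to one row per contour, then decompose.
aligned = np.array([procrustes_align(c, reference).ravel() for c in contours])
components, explained = pca(aligned, n_components=2)
print(components.shape, explained)
```

Each row of `components` is a candidate basis shape over the flattened (x, y) coordinates, and `explained` gives the share of pooled variance each one captures in this toy data.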

Date: October 1, 2024
Authors: Sean Foley, Shrikanth Narayanan
Journal: The Journal of the Acoustical Society of America
Volume: 156
Issue: 4_Supplement
Pages: A49-A49
Publisher: Acoustical Society of America