Collaborative Research: Improving speech technology for better learning outcomes: the case of AAE child speakers |
|
The goal of this project is to develop new spoken language processing technology that enables interactive dialog between children and a virtual agent to support literacy learning and assessment, with a focus on serving underrepresented communities. Many children who speak African American English (AAE) struggle with literacy, but the spoken language systems that could deliver effective interventions perform much worse with AAE speakers, because these speakers are seldom included in the samples used to train speech recognition or TTS systems. While our focus is on one dialect (AAE), the goal is to develop methods that can be applied to other dialects, so we focus on the scenario of learning from limited data. Since studies have shown that ASR performance on adult AAE is much worse than on GAE, and recognizing children's speech is known to be more difficult than recognizing adults' speech, our assessment of the technology's impact on learning leverages a constrained dialog task, with initial experiments in a Wizard-of-Oz (WoZ) setting. (details) |
|
Voice Source Project |
|
In voiced speech, the vocal folds open and close quasi-periodically and thus convert the glottal air flow (air volume velocity) into a train of flow pulses, which is referred to as the voice source excitation signal. Early models of the source signal used a simple impulse train for modeling voiced excitation. None of these models has been calibrated with direct observations of glottal area changes, which are the proximal cause of the air pressure changes that we hear as sound. The effective study of the voice source thus requires both more accurate source models and a comprehensive set of underlying observations on which to base the models. The primary goal of the proposed research is to develop and evaluate a new, more powerful source model based on direct observations of vocal fold vibrations... (details) |
|
The Subglottal Resonances: Research and Applications |
|
During the past few decades, research efforts in the area of speech processing have focused on the extraction of reliable acoustic features for applications such as automatic speech recognition, speaker identification, and speech coding, among many others. These acoustic features are related either to the vocal tract (filter) or to the glottal air flow (source) that drives it. Although the mechanics of the supraglottal (above the glottis) system are well understood, the subglottal (below the glottis) system and its properties have not been explored in great detail. Unlike the supraglottal tract, the configuration of the subglottal system remains fairly constant during speech production, which makes its properties very interesting and useful. In particular, its resonant frequencies, through subtle interactions with the speech signal, are believed to have the potential to minimize acoustic differences among speakers and also to provide valuable information about a speaker's identity... (details) An application to estimate your height from your voice: http://ucla-voice-and-height.herokuapp.com/ |
|
|
|
|
|
|
|