12th Annual Conference of the
International Speech Communication Association
Interspeech 2011, Florence
Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.
Mon-Ses1-O2: Speech Production - Articulatory Measurements
Time: Monday 10:00
Place: Leonardo - Pala Affari - Ground Floor
Type: Oral
Chair: Paavo Alku
10:00 | Visualization of vocal tract shape using interleaved real-time MRI of multiple scan planes
Yoon-Chul Kim (University of Southern California), Michael I. Proctor (University of Southern California), Shrikanth S. Narayanan (University of Southern California), Krishna S. Nayak (University of Southern California)
Conventional real-time magnetic resonance imaging (RT-MRI) of the upper airway typically acquires information about the vocal tract from a single midsagittal scan plane. This provides insights into the dynamics of all articulators, but does not allow for visualization of several important features in vocal tract shaping, such as grooving/doming of the tongue, asymmetries in tongue shape, and lateral shaping of the pharyngeal airway. In this paper, we present an approach to RT-MRI of multiple scan planes of interest using time-interleaved acquisition, in which temporal resolution is compromised for greater spatial coverage. We demonstrate simultaneous visualization of vocal tract dynamics from midsagittal, coronal, and axial scan planes in the articulation of English fricatives.
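As an aside for readers unfamiliar with interleaved acquisition, the sketch below illustrates the basic trade-off the abstract describes: acquisition slots are shared across scan planes, so each plane is refreshed at only a fraction of the single-plane frame rate. The round-robin ordering, plane labels, and frame rate are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch of a time-interleaved multi-plane acquisition schedule,
# assuming a simple round-robin ordering of scan planes; the actual
# interleaving scheme and frame rates are not specified in the abstract.

def interleaved_schedule(planes, n_frames, base_fps):
    """Assign successive acquisition slots to scan planes in round-robin order.

    planes   : list of plane labels, e.g. ["midsagittal", "coronal", "axial"]
    n_frames : total number of acquisition slots
    base_fps : single-plane frame rate of the scanner (frames per second)
    """
    schedule = [(i / base_fps, planes[i % len(planes)]) for i in range(n_frames)]
    per_plane_fps = base_fps / len(planes)   # temporal resolution traded for coverage
    return schedule, per_plane_fps

schedule, fps = interleaved_schedule(["midsagittal", "coronal", "axial"], 9, 24.0)
for t, plane in schedule:
    print(f"t = {t:.3f} s  ->  {plane}")
print(f"effective per-plane rate: {fps:.1f} fps")
```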
10:20 | Biomechanical Tongue Models: An Approach to Studying Inter-speaker Variability
Ralf Winkler (ZAS, Berlin, Germany), Susanne Fuchs (ZAS, Berlin, Germany), Pascal Perrier (DPC/GIPSA-lab, Grenoble-INP, CNRS, Grenoble, France), Mark Tiede (Haskins Labs, New Haven, CT, USA and R.L.E.-MIT, Boston, MA, USA)
Speakers of a given language vary with respect to their acoustics, articulation, and motor commands. This variation is driven by a variety of influences, such as emotional states, communicative interaction, and individual properties of the vocal tract. In this work we focus on the latter. First, we build speaker-specific biomechanical tongue models. Second, on the basis of extensive simulations with two different models, we discuss the impact of the relative position of the vocal tract bend. We focus on /i, a, u/ by defining target regions in the acoustic space, and discuss the corresponding speaker-specific variability observed in articulation and motor commands.
10:40 | Quantifying Articulatory Distinctiveness of Vowels
Jun Wang (University of Nebraska - Lincoln), Jordan R. Green (University of Nebraska - Lincoln), Ashok Samal (University of Nebraska - Lincoln), David B. Marx (University of Nebraska - Lincoln)
The articulatory distinctiveness among vowels has been frequently characterized descriptively based on tongue height and front-back position; however, very few empirical methods have been proposed to characterize vowels based on time-varying articulatory characteristics. Such information is not only needed to improve knowledge about the articulation of vowels but also to determine the contribution of articulatory imprecision to poor speech intelligibility. In this paper, a novel statistical shape analysis was used to derive a vowel space that depicted the quantified articulatory distinctiveness among vowels based on tongue and lip movements. The effectiveness of the approach was supported by vowel classification accuracy of up to 91.7%. The theoretical relevance and clinical implications of the derived vowel space were discussed.
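The abstract does not name the specific shape-analysis method, so the sketch below uses one common choice, a Procrustes dissimilarity between time-sampled articulator trajectories followed by classical multidimensional scaling, purely to illustrate how a quantified "vowel space" can be derived from movement data. The trajectories are synthetic placeholders, not measured tongue or lip data.

```python
# Illustrative sketch only: Procrustes distance between articulator trajectories
# plus classical MDS to lay out a 2-D "vowel space". All data are synthetic.
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(0)
vowels = ["i", "a", "u"]
# Each entry: T x 2 matrix of (x, y) positions for one articulator marker.
trajectories = {v: rng.normal(size=(20, 2)).cumsum(axis=0) for v in vowels}

n = len(vowels)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        _, _, disparity = procrustes(trajectories[vowels[i]], trajectories[vowels[j]])
        D[i, j] = D[j, i] = disparity      # shape dissimilarity after alignment

# Classical MDS: double-centre the squared distances, keep the top two eigenvectors.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
w, V = np.linalg.eigh(B)
coords = V[:, -2:] * np.sqrt(np.clip(w[-2:], 0, None))
for v, xy in zip(vowels, coords):
    print(v, xy)
```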
11:00 | Direct Estimation of Articulatory Kinematics from Real-time Magnetic Resonance Image Sequences
Michael Proctor (University of Southern California), Adam Lammert (University of Southern California), Athanasios Katsamanis (University of Southern California), Louis Goldstein (University of Southern California), Christina Hagedorn (University of Southern California), Shrikanth Narayanan (University of Southern California)
A method of rapid, automatic extraction of consonantal articulatory trajectories from real-time magnetic resonance image sequences is described. Constriction location targets are estimated by identifying regions of maximally-dynamic correlated pixel activity along the palate, the alveolar ridge, and at the lips. Tissue movement into and out of the constriction location is estimated by calculating the change in mean pixel intensity in a circle located at the center of the region of interest. Closure and release gesture timings are estimated from landmarks in the velocity profile derived from the smoothed intensity function. We demonstrate the utility of the technique in the analysis of Italian intervocalic consonant production.
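The pipeline described here lends itself to a compact illustration. The sketch below, with a synthetic image sequence and a hand-picked circular region of interest, shows the core steps of tracking mean pixel intensity in the ROI, smoothing, differentiating, and picking velocity extrema as closure and release landmarks; the correlation-based ROI selection and the paper's exact landmark criteria are not reproduced.

```python
# Hedged sketch of the intensity-tracking idea: synthetic frames, hand-picked
# circular ROI, moving-average smoothing, and velocity extrema as landmarks.
import numpy as np

def mean_intensity_in_circle(frames, center, radius):
    """Mean pixel intensity inside a circular ROI for each frame (T x H x W)."""
    h, w = frames.shape[1:]
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2
    return frames[:, mask].mean(axis=1)

def gesture_landmarks(intensity, fps, win=5):
    """Closure/release estimates from extrema of the smoothed intensity velocity."""
    kernel = np.ones(win) / win                      # simple moving-average smoothing
    smoothed = np.convolve(intensity, kernel, mode="same")
    velocity = np.gradient(smoothed) * fps           # intensity change per second
    closure_peak = int(np.argmax(velocity))          # fastest intensity rise
    release_peak = int(np.argmin(velocity))          # fastest intensity fall
    return closure_peak / fps, release_peak / fps

# Synthetic example: intensity rises as tissue moves into the ROI, then falls.
frames = np.random.rand(60, 64, 64)                  # placeholder image sequence
intensity = mean_intensity_in_circle(frames, center=(32, 20), radius=4)
print(gesture_landmarks(intensity, fps=23.0))
```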
11:20 | Combined optical distance sensing and electropalatography to measure articulation
Peter Birkholz (Clinic for Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen and RWTH Aachen University), Christiane Neuschaefer-Rube (Clinic for Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen and RWTH Aachen University)
We present the first prototype of a new optoelectronic instrument for the combined real-time measurement of the tongue contour in the mid-sagittal plane, the contact pattern between the tongue and the palate, and the position of the lips. The instrument consists of a thin acrylic pseudopalate with embedded contact sensors, as in electropalatography, and optical distance sensors to measure tongue-palate distances, as in glossometry. One additional distance sensor is located at the anterior side of the upper incisors to register the degree of opening and protrusion of the lips. Together, the sensors provide complementary information about the articulation of vowels and consonants, which was verified in initial experiments. The instrument offers new perspectives for the study of normal and disordered speech production, as well as for silent speech interfaces and speech prostheses for laryngectomees.
11:40 | Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics
Santitham Prom-on (University College London), Yi Xu (University College London), Fang Liu (Stanford University)
Post-L F0 bouncing (post-L bouncing for short) is a prosodic phenomenon whereby F0 is temporarily raised following a very low pitch. The phenomenon is quite robust, but is not widely known, and it has never been computationally modeled. This paper presents the results of our simulation of the phenomenon by modeling articulatory dynamics. Using the quantitative Target Approximation (qTA) model, we were able to simulate the F0 rise after the Mandarin L tone by adding an acceleration adjustment to the initial state of the first post-L Neutral tone. Furthermore, a linear relationship was found between the added acceleration and the amount of F0 lowering in the L tone. We interpreted the results as evidence that post-L bouncing is directly related to the articulatory mechanism of producing a very low pitch.
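For readers who want to see what "adding an acceleration adjustment to the initial state" means concretely, the following is a minimal sketch of the qTA response, a third-order critically damped system approaching a linear pitch target x(t) = mt + b, with the onset acceleration exposed as a parameter. The numerical values are illustrative and are not taken from the paper.

```python
# Minimal sketch of the quantitative Target Approximation (qTA) response,
# with the initial-state acceleration exposed so it can be adjusted as the
# abstract describes. Parameter values here are illustrative assumptions.
import numpy as np

def qta_f0(t, m, b, lam, f0_0, v0, a0):
    """F0 contour y(t) approaching the linear pitch target x(t) = m*t + b.

    m, b         : slope and height of the pitch target
    lam          : rate of target approximation (lambda)
    f0_0, v0, a0 : initial F0, velocity, and acceleration at syllable onset
    """
    c1 = f0_0 - b
    c2 = v0 + c1 * lam - m
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0
    return m * t + b + (c1 + c2 * t + c3 * t ** 2) * np.exp(-lam * t)

t = np.linspace(0.0, 0.25, 100)                 # one syllable, 250 ms
baseline = qta_f0(t, m=0.0, b=10.0, lam=30.0, f0_0=8.0, v0=0.0, a0=0.0)
bounced  = qta_f0(t, m=0.0, b=10.0, lam=30.0, f0_0=8.0, v0=0.0, a0=8000.0)
print(float(bounced.max() - baseline.max()))    # extra F0 rise from the added acceleration
```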