12th Annual Conference of the
International Speech Communication Association
Interspeech 2011 Florence
Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.
Wed-Ses1-O2: Prosody I
Time: Wednesday 10:00
Place: Leonardo - Pala Affari - Ground Floor
Type: Oral
Chair: Gérard Bailly
10:00 | A quantitative investigation of the prosody of Verum Focus in Italian
Giuseppina Turco (Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands) Michele Gubian (Centre for Language & Speech Technology, Radboud University, Nijmegen, The Netherlands) Jessamyn Schertz (Centre for Language & Speech Technology, Radboud University, Nijmegen, The Netherlands)
In this study we present a preliminary investigation of the prosodic marking of Verum focus (VF) in Italian, which is said to be realized with a pitch accent on the finite verb (e.g. A: Paul has not eaten the banana - B: (No), Paul HAS eaten the banana!). We tried to discover whether and how Italian speakers prosodically mark VF when producing full-fledged sentences, using a semi-spontaneous production experiment with 27 speakers. Speech rate and f0 contours were extracted using automatic data processing tools and were subsequently analysed using Functional Data Analysis (FDA), which allowed for automatic visualization of patterns in the contour shapes. Our results show that the postfocal region of VF sentences exhibits a faster speech rate and lower f0 compared to non-VF cases. However, the expected consistent f0 difference in the focal region of VF sentences was not found in this analysis.
|
10:20 | Effects of focus on f0 and duration in Irish (Gaelic) declaratives
Amelie Dorn (Trinity College Dublin) Ailbhe Ní Chasaide (Trinity College Dublin)
This pilot study investigates the effects of focus (broad, narrow and contrastive) on tonal patterns, f0 scaling and duration of accented syllables and rhythmic feet for a controlled dataset in Donegal Irish. Results show differences in pre-focal tonal patterns between broad focus and the other focus types. Narrow and contrastive focus renditions are implemented by largely the same phonetic means. Focused domains are overall longer in duration and have a wider f0 excursion than in broad focus. Durational differences depend on sentence position.
|
10:40 | The phonology and phonetics of perceived prosody: What do listeners imitate?
Jennifer Cole (University of Illinois at Urbana-Champaign) Stefanie Shattuck-Hufnagel (Massachusetts Institute of Technology)
An imitation experiment tests the hypothesis that when asked to reproduce a spontaneously-spoken utterance that they hear, speakers imitate the prosody of the stimulus in its phonological structure more accurately than the phonetic details. Results suggest that speakers rarely distort the presence of a pitch accent or an intonational phrase boundary, but more often change the nature of the phonetic cues, e.g. the duration of a pause or the occurrence of irregular pitch periods associated with boundaries and accents in American English. These findings argue for an encoding of phonological prosodic structure that is separate from the phonetic cues that signal that structure.
|
11:00 | Uncovering the effect of imitation on tonal patterns of French Accentual Phrases
Amandine Michelas (Aix-Marseille Université & Laboratoire Parole et Langage) Noël Nguyen (Aix-Marseille Université & Laboratoire Parole et Langage)
French accentual phrases (APs) are characterized by the presence of a typical final f0 rise and an optional/additional initial f0 rise. This study tested whether between-speaker speech imitation influenced the realization of AP tonal patterns. The experiment was based on 3-syllable APs, whose tonal patterns differed in the potential placement of an initial rise. In two shadowing tasks (without/with explicit instructions to imitate the speaker’s way of pronouncing the stimuli), participants produced more initial rises when they heard a stimulus including both initial and final rises relative to stimuli in which only a final rise was present. Thus, imitation influences the realization of AP tonal patterns in French.
|
11:20 | Crossmodal prosodic and gestural contribution to the perception of contrastive focus
Pilar Prieto (ICREA- Universitat Pompeu Fabra) Cecilia Pugliesi (Universitat Pompeu Fabra) Joan Borràs-Comes (Universitat Pompeu Fabra) Ernesto Arroyo (Universitat Pompeu Fabra) Josep Blat (Universitat Pompeu Fabra)
Speech prosody has traditionally been analyzed in terms of acoustic features. Even though visual features and gestures have been shown to help and enhance linguistic processing, the conventional view is that facial and body gesture information in oral (non-sign) languages tends to be redundant and has the role of helping the hearer recover the meaning of an utterance. We conducted two perception experiments with a 3D animated character showing conflicting auditory and visual information to investigate two related questions regarding the importance of gestures in conveying prosodic meaning: (a) how important are facial cues with respect to auditory cues for the perception of contrastive focus?; and (b) what is the relevance of the different gestural movements (i.e., head nod and eyebrow raising) for the perception of this type of focus? Our findings reveal that the visual component is crucial in the semantic interpretation of contrastive focus.
|
11:40 | Temporal relationship between auditory and visual prosodic cues
Erin Cvejic (MARCS Auditory Laboratories, University of Western Sydney, Australia) Jeesun Kim (MARCS Auditory Laboratories, University of Western Sydney, Australia) Chris Davis (MARCS Auditory Laboratories, University of Western Sydney, Australia)
It has been reported that non-articulatory visual cues to prosody tend to align with auditory cues, emphasizing auditory events that are in close alignment (visual alignment hypothesis). We investigated the temporal relationship between visual and auditory prosodic cues in a large corpus of utterances to determine the extent to which non-articulatory visual prosodic cues align with auditory ones. Six speakers saying 30 sentences in three prosodic conditions (x2 repetitions) were recorded in a dialogue exchange task, to measure how often eyebrow movements and rigid head tilts aligned with auditory prosodic cues, the temporal distribution of such movements, and the variation across prosodic conditions. The timing of brow raises and head tilts was not aligned with auditory cues, and the occurrence of visual cues was inconsistent, lending little support to the visual alignment hypothesis. Different types of visual cues may combine with auditory cues in different ways to signal prosody.