ISCA

12th Annual Conference of the
International Speech Communication Association

Interspeech 2011 Florence

Technical Programme

This is the final programme for this session. For oral sessions, the timing on the left is the current presentation order, but this may still change, so please check at the conference itself.

Wed-Ses1-O2: Prosody I

Time: Wednesday 10:00 Place: Leonardo - Pala Affari - Ground Floor Type: Oral
Chair: Gérard Bailly

10:00 A quantitative investigation of the prosody of Verum Focus in Italian

Giuseppina Turco (Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands)
Michele Gubian (Centre for Language & Speech Technology, Radboud University, Nijmegen, The Netherlands)
Jessamyn Schertz (Centre for Language & Speech Technology, Radboud University, Nijmegen, The Netherlands)

In this study we present a preliminary investigation of the prosodic marking of Verum focus (VF) in Italian, which is said to be realized with a pitch accent on the finite verb (e.g. A: Paul has not eaten the banana - B: (No), Paul HAS eaten the banana!). Using a semi-spontaneous production experiment with 27 speakers, we tested whether and how Italian speakers prosodically mark VF when producing full-fledged sentences. Speech rate and f0 contours were extracted using automatic data processing tools and were subsequently analysed using Functional Data Analysis (FDA), which allowed for automatic visualization of patterns in the contour shapes. Our results show that the postfocal region of VF sentences exhibits a faster speech rate and lower f0 than in non-VF cases. However, the expected consistent f0 difference in the focal region of VF sentences was not found in this analysis.

10:20 Effects of focus on f0 and duration in Irish (Gaelic) declaratives

Amelie Dorn (Trinity College Dublin)
Ailbhe Ní Chasaide (Trinity College Dublin)

This pilot study investigates the effects of focus (broad, narrow and contrastive) on tonal patterns, f0 scaling and duration of accented syllables and rhythmic feet for a controlled dataset in Donegal Irish. Results show differences in pre-focal tonal patterns between broad focus and the other focus types. Narrow and contrastive focus renditions are implemented by largely the same phonetic means. Focused domains are longer overall and have a wider f0 excursion than in broad focus. Durational differences depend on sentence position.

10:40 The phonology and phonetics of perceived prosody: What do listeners imitate?

Jennifer Cole (University of Illinois at Urbana-Champaign)
Stefanie Shattuck-Hufnagel (Massachusetts Institute of Technology)

An imitation experiment tests the hypothesis that when asked to reproduce a spontaneously spoken utterance that they hear, speakers imitate the phonological structure of the stimulus prosody more accurately than its phonetic details. Results suggest that speakers rarely distort the presence of a pitch accent or an intonational phrase boundary, but more often change the nature of the phonetic cues, e.g. the duration of a pause or the occurrence of irregular pitch periods associated with boundaries and accents in American English. These findings argue for an encoding of phonological prosodic structure that is separate from the phonetic cues that signal that structure.

11:00 Uncovering the effect of imitation on tonal patterns of French Accentual Phrases

Amandine Michelas (Aix-Marseille Université & Laboratoire Parole et Langage)
Noël Nguyen (Aix-Marseille Université & Laboratoire Parole et Langage)

French accentual phrases (APs) are characterized by a typical final f0 rise and an optional, additional initial f0 rise. This study tested whether between-speaker speech imitation influences the realization of AP tonal patterns. The experiment was based on 3-syllable APs, whose tonal patterns differed in the potential placement of an initial rise. In two shadowing tasks (without/with explicit instructions to imitate the speaker’s way of pronouncing the stimuli), participants produced more initial rises when they heard a stimulus including both initial and final rises relative to stimuli in which only a final rise was present. Thus, imitation influences the realization of AP tonal patterns in French.

11:20 Crossmodal prosodic and gestural contribution to the perception of contrastive focus

Pilar Prieto (ICREA- Universitat Pompeu Fabra)
Cecilia Pugliesi (Universitat Pompeu Fabra)
Joan Borràs-Comes (Universitat Pompeu Fabra)
Ernesto Arroyo (Universitat Pompeu Fabra)
Josep Blat (Universitat Pompeu Fabra)

Speech prosody has traditionally been analyzed in terms of acoustic features. Even though visual features and gestures have been shown to help and enhance linguistic processing, the conventional view is that facial and body gesture information in oral (non-sign) languages tends to be redundant and has the role of helping the hearer recover the meaning of an utterance. We conducted two perception experiments with a 3D animated character showing conflicting auditory and visual information to investigate two related questions regarding the importance of gestures in conveying prosodic meaning: (a) how important are facial cues with respect to auditory cues for the perception of contrastive focus?; and (b) what is the relevance of the different gestural movements (i.e., head nod and eyebrow raising) for the perception of this type of focus? Our findings reveal that the visual component is crucial in the semantic interpretation of contrastive focus.

11:40 Temporal relationship between auditory and visual prosodic cues

Erin Cvejic (MARCS Auditory Laboratories, University of Western Sydney, Australia)
Jeesun Kim (MARCS Auditory Laboratories, University of Western Sydney, Australia)
Chris Davis (MARCS Auditory Laboratories, University of Western Sydney, Australia)

It has been reported that non-articulatory visual cues to prosody tend to align with auditory cues, emphasizing auditory events that are in close alignment (visual alignment hypothesis). We investigated the temporal relationship between visual and auditory prosodic cues in a large corpus of utterances to determine the extent to which non-articulatory visual prosodic cues align with auditory ones. Six speakers saying 30 sentences in three prosodic conditions (x2 repetitions) were recorded in a dialogue exchange task, to measure how often eyebrow movements and rigid head tilts aligned with auditory prosodic cues, the temporal distribution of such movements, and the variation across prosodic conditions. The timing of brow raises and head tilts was not aligned with auditory cues, and the occurrence of visual cues was inconsistent, lending little support to the visual alignment hypothesis. Different types of visual cues may combine with auditory cues in different ways to signal prosody.