12th Annual Conference of the
International Speech Communication Association
Interspeech 2011 Florence
Technical Programme
This is the final programme for this session. For oral sessions, the timing on the left shows the current presentation order, but this may still change, so please check at the conference itself.
Tue-Ses2-O4: Spoken Dialogue Systems I
Time: Tuesday 13:30
Place: Michelangelo - Pala Affari - 2nd Floor
Type: Oral
Chair: Olivier Pietquin
13:30 | User Study of Spoken Decision Support System
Teruhisa Misu (NICT), Kiyonori Ohtake (NICT), Chiori Hori (NICT), Hisashi Kawai (NICT), Satoshi Nakamura (NICT)
This paper presents the results of the user evaluation of spoken decision support dialogue systems, which help users select from a set of alternatives. Thus far, we have modeled this decision support dialogue as a partially observable Markov decision process (POMDP), and optimized its dialogue strategy to maximize the value of the user's decision. In this paper, we present a comparative evaluation of the optimized dialogue strategy with several baseline methods, and demonstrate that the optimized dialogue strategy that was effective in user simulation experiments works well in an evaluation by real users.
13:50 | Efficient Probabilistic Tracking of User Goal and Dialog History for Spoken Dialog Systems
Antoine Raux (Honda Research Institute USA), Yi Ma (Ohio State University)
In this paper, we describe Dynamic Probabilistic Ontology Trees, a new probabilistic model to track dialog state in a dialog system. Our model captures both the user goal and the history of user dialog acts using a unified Bayesian Network. We perform efficient inference using a form of blocked Gibbs sampling designed to exploit the structure of the model. Evaluation on a corpus of dialogs from the CMU Let's Go system shows that our approach significantly outperforms a deterministic baseline and is able to exploit long N-best lists without loss of accuracy.
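The core idea behind probabilistic dialog state tracking can be illustrated with a much simpler model than the paper's. The sketch below (all names and parameters are illustrative, not the authors') accumulates evidence for a user goal over successive ASR N-best lists by Bayesian updating; Dynamic Probabilistic Ontology Trees additionally model dialog-act history in a structured Bayesian network and infer with blocked Gibbs sampling.

```python
# Minimal sketch of user-goal tracking from noisy ASR N-best lists.
# This only illustrates the general idea of exploiting long N-best lists:
# evidence for a goal accumulates across turns, so single misrecognitions
# are outweighed. p_correct is a hypothetical observation-reliability prior.

def update_belief(belief, nbest, p_correct=0.7):
    """Update P(goal) given one N-best list of (hypothesis, confidence) pairs."""
    new = {}
    for goal, prior in belief.items():
        # Likelihood: confidence mass of hypotheses matching this goal,
        # plus a small uniform error mass so no goal is ruled out outright.
        like = sum(conf for hyp, conf in nbest if hyp == goal)
        new[goal] = prior * (p_correct * like + (1 - p_correct) / len(belief))
    z = sum(new.values())
    return {g: p / z for g, p in new.items()}

# Toy Let's Go-style destinations (illustrative only).
belief = {"downtown": 1 / 3, "airport": 1 / 3, "forbes_ave": 1 / 3}
belief = update_belief(belief, [("airport", 0.6), ("downtown", 0.3)])
belief = update_belief(belief, [("airport", 0.8)])
print(max(belief, key=belief.get))  # -> airport
```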
14:10 | Tackling a Shilly-Shally Classifier for Predicting Task Success in Spoken Dialogue Interaction
Alexander Schmitt (University of Ulm, Germany), Alexander Zgorzelski (University of Ulm, Germany), Wolfgang Minker (University of Ulm, Germany)
Statistical models that predict that a task with a telephone-based Spoken Dialogue System (SDS) is unlikely to be completed can be useful for adapting dialogue strategies. They can also trigger the decision to route callers directly to human assistance once it is clear that the SDS cannot automate the call. This paper addresses a number of issues that arise when deploying such models. We show that the predictions of a model are subject to strong variations between adjacent dialogue steps. As a consequence, we show that accuracy can be significantly raised by using sequences of equal predictions as the basis for decision making. Furthermore, we implement a confidence metric that takes the certainty of the classifier into account to determine the optimum decision point.
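The stabilization idea can be sketched compactly: rather than acting on a single (possibly shilly-shallying) per-step prediction, act only once several consecutive steps agree with sufficient average classifier confidence. The function, window size, and threshold below are illustrative assumptions, not the authors' actual parameters.

```python
# Hedged sketch: smooth a step-wise task-success classifier by requiring
# k consecutive identical predictions whose mean confidence clears a
# threshold before committing to a decision (e.g. routing to an agent).

def decision_point(steps, k=3, min_conf=0.6):
    """steps: list of (label, confidence) pairs, one per dialogue step.
    Returns (step_index, label) at the first point where the last k
    predictions agree with enough average confidence, else None."""
    for i in range(k - 1, len(steps)):
        window = steps[i - k + 1 : i + 1]
        labels = {label for label, _ in window}
        if len(labels) == 1 and sum(c for _, c in window) / k >= min_conf:
            return i, window[0][0]
    return None

steps = [("fail", 0.55), ("ok", 0.5), ("fail", 0.7), ("fail", 0.65), ("fail", 0.8)]
print(decision_point(steps))  # -> (4, 'fail')
```

A single wavering "ok" at step 1 no longer triggers a decision; only the stable run of "fail" predictions at the end does.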
14:30 | Evaluation of Listening-oriented Dialogue Control Rules based on the Analysis of HMMs
Toyomi Meguro (NTT Communication Science Laboratories, NTT Corporation), Ryuichiro Higashinaka (NTT Cyber Space Laboratories, NTT Corporation), Yasuhiro Minami (NTT Communication Science Laboratories, NTT Corporation), Kohji Dohsaka (NTT Communication Science Laboratories, NTT Corporation)
We have been working on listening-oriented dialogues for the purpose of building listening agents. In our previous work [1], we trained hidden Markov models (HMMs) from listening-oriented dialogues (LoDs) between humans and, by analyzing them, discovered a distinguishing dialogue flow of LoDs. For example, listeners suppress their information giving and self-disclosure and, instead, increase acknowledgments and questions to lead speakers' utterances. As an initial step towards building listening agents, we decided to create dialogue control rules based on our analysis of the HMMs. We built our rule-based system and compared it with three other systems in a Wizard of Oz (WoZ) experiment. As a result, we found that our rule-based system achieved as much user satisfaction as human listeners.
14:50 | Large-Scale Experiments on Data-Driven Design of Commercial Spoken Dialog Systems
David Suendermann (SpeechCycle), Jackson Liscombe (SpeechCycle), Jonathan Bloom (SpeechCycle), Grace Li (SpeechCycle), Roberto Pieraccini (SpeechCycle)
The design of commercial spoken dialog systems is most commonly based on hand-crafting call flows. Voice interaction designers write prompts, predict caller responses, set speech recognition parameters, and implement interaction strategies, all based on "best design practices". Recently, we presented the mathematical framework "Contender" (similar to reinforcement learning) that allows for replacing manual decisions made during system design with data-driven soft decisions made at system run time, optimizing the cumulative reward of an application. The current paper reports on the results of 26 Contenders implemented in commercial applications processing a total of about 15 million calls.
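The flavor of such run-time soft decisions can be conveyed with a simple bandit-style analogy. The sketch below is not Contender itself (the abstract only says the framework is similar to reinforcement learning); it is a generic epsilon-greedy selector, with illustrative names, that mostly routes calls to the call-flow alternative with the best observed reward while still exploring the others.

```python
import random

# Hedged analogy for data-driven decisions among competing call-flow
# alternatives: an epsilon-greedy bandit that optimizes cumulative reward
# (e.g. reward 1.0 when a call is successfully automated).

class AlternativeSelector:
    def __init__(self, alternatives, epsilon=0.1):
        self.eps = epsilon
        self.counts = {a: 0 for a in alternatives}
        self.rewards = {a: 0.0 for a in alternatives}

    def mean(self, a):
        return self.rewards[a] / self.counts[a] if self.counts[a] else 0.0

    def choose(self):
        if random.random() < self.eps:
            return random.choice(list(self.counts))          # explore
        return max(self.counts, key=self.mean)               # exploit

    def update(self, alternative, reward):
        self.counts[alternative] += 1
        self.rewards[alternative] += reward
```

At scale (millions of calls, as in the paper), even small per-call reward differences between alternatives become statistically detectable, which is what makes run-time rather than design-time decisions attractive.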
15:10 | Comparing system-driven and free dialogue in in-vehicle interaction
Fredrik Kronlid (Talkamatic AB), Jessica Villing (University of Gothenburg), Alexander Berman (Talkamatic AB), Staffan Larsson (University of Gothenburg)
It is widely held that a free, natural dialogue model is more efficient and less distracting than system-initiative, state-based dialogue. This paper describes an evaluation of two systems, one using system-directed dialogue and one using a more "free" dialogue, focusing on distraction and efficiency. The level of distraction is measured using an automotive industry standard test (LCT), and the efficiency is measured by counting the number of completed tasks. Efficiency is increased by 42% using the free, natural dialogue model, while the LCT results are inconclusive. Using a free dialogue model increases efficiency and, in some cases, reduces distraction.