12th Annual Conference of the
International Speech Communication Association

Interspeech 2011 Florence

Keynotes

The Interspeech 2011 Organising Committee is pleased to announce the distinguished keynote speakers who will give plenary talks at the conference.

This year we will host a round-table on the theme of the conference "Speech Science and Technology for life", entitled "Future and Applications of Speech and Language Technologies for the Good Health of Society". The following distinguished keynote speakers will share their research and vision:

Sincerely,
Giuseppe Riccardi
Plenary/Keynote Chair

Sunday, August 28th, 2011
2011 ISCA Medallist

Speaking More Like You: Entrainment in Conversational Speech

Prof. Julia Hirschberg
Professor, Department of Computer Science
Columbia University
http://www.cs.columbia.edu/~julia/

Abstract

When people engage in conversation, they adapt their speaking style to that of their conversational partner in a number of ways. They have been shown to adopt their interlocutor's way of describing objects and to align their accent, their syntax, their pitch range, and their speaking rate to their partner's, as well as their gestures. They also adapt their turn-taking style and use of cue phrases to match their partner's. These types of entrainment have been shown to correlate with various measures of task success and dialogue naturalness. While there is considerable evidence for lexical entrainment from laboratory experiments, less is known about acoustic-prosodic and discourse-level entrainment, and little work has been done to examine entrainment in multiple modalities for the same dialogue. I will discuss research on entrainment in multiple dimensions in the Columbia Games Corpus and the Switchboard Corpus. Our goal is to understand how the different varieties of entrainment correlate with one another and to determine which types of entrainment will be both useful and feasible to model in Spoken Dialogue Systems.

Speaker Biography

Prof. Julia Hirschberg is a professor of Computer Science at Columbia University, working on prosody, emotional speech, and dialogue systems. She was editor-in-chief of Computational Linguistics from 1993 to 2003 and co-editor-in-chief of Speech Communication from 2003 to 2006. She served on the Executive Board of the Association for Computational Linguistics (ACL) from 1993 to 2003 and on the board of the International Speech Communication Association (ISCA) from 1999 to 2007 (as President from 2005 to 2007). She has served on the Permanent Council of the International Conference on Spoken Language Processing (ICSLP) since 1996 and is currently on the board of the CRA-W. She has been a fellow of the American Association for Artificial Intelligence since 1994 and an ISCA Fellow since 2008. She received an honorary doctorate from KTH in 2007 and the James L. Flanagan Speech and Audio Processing Award in 2011.

Monday, August 29th, 2011

Neural Representations of Word Meanings

Prof. Tom M. Mitchell
E. Fredkin University Professor
Chair, Machine Learning Department, School of Computer Science
Carnegie Mellon University
http://www.cs.cmu.edu/~tom/

Abstract

How does the human brain represent meanings of words and pictures in terms of neural activity? This talk will present our research addressing this question by applying machine learning algorithms to fMRI and MEG brain image data. One line of our research involves training classifiers that identify which word a person is thinking about, based on their observed neural activity. A second line involves training computational models that predict the neural activity associated with arbitrary English words, including words for which we do not yet have brain image data. A third line of work involves examining neural activity at millisecond time resolution during the comprehension of words and phrases.

Speaker Biography

Professor Tom M. Mitchell is the E. Fredkin University Professor and head of the Machine Learning Department at Carnegie Mellon University. His research interests lie in cognitive neuroscience, machine learning, natural language processing, and artificial intelligence. Mitchell is a member of the US National Academy of Engineering, a Fellow of the American Association for the Advancement of Science (AAAS), and a Fellow and Past President of the Association for the Advancement of Artificial Intelligence (AAAI). Mitchell believes the field of machine learning will be the fastest growing branch of computer science during the 21st century.

Tuesday, August 30th, 2011

Honest Signals

Prof. Alex 'Sandy' Pentland
Toshiba Professor of Media Arts and Sciences
Massachusetts Institute of Technology
http://www.media.mit.edu/~pentland

Abstract

How did humans coordinate before we had sophisticated language capabilities? Pre-linguistic social species coordinate by signaling, and in particular by 'honest signals', which actually cause changes in the listener. I will present examples of human behaviors that are likely honest signals, and show that they can be used to predict the outcomes of dyadic interactions (dating, negotiation, trust assessment, etc.) with an average accuracy of 80%. Patterns of signaling also allow accurate identification of social and task roles in small groups, predict task performance in small groups, help guide team formation, and help explain aspects of organizational performance. These experiments suggest that modern language evolved 'on top' of ancient signaling mechanisms, and that today linguistic and signaling mechanisms operate in parallel.

Speaker Biography

Professor Alex 'Sandy' Pentland is a pioneer in computational social science, organizational engineering, and mobile information systems. He directs the MIT Human Dynamics Lab, developing computational social science and using this new science to guide organizational engineering. He also directs the Media Lab Entrepreneurship Program, spinning off companies to bring MIT technologies into the real world. He is among the most-cited computer scientists in the world.

Wednesday, August 31st, 2011
Round Table

Future and Applications of Speech and Language Technologies for the Good Health of Society

Prof. Gabriele Miceli
Language disorders: viewpoints on a complex object

Prof. Björn Granström
Speech technology in (re)habilitation of persons with communication disabilities

Prof. Hiroshi Ishiguro
From teleoperated androids to cellphones as surrogates

Language disorders: viewpoints on a complex object

Prof. Gabriele Miceli
Università degli Studi di Trento
Dipartimento di Scienze della Cognizione e della Formazione
http://discof.unitn.it

Abstract

Aphasia is a common consequence of damage to the side of the brain that severely limits the individual’s ability to communicate by means of language. In the US (population: 311 million; source: Census Bureau), approximately 80,000 adults acquire aphasia each year, and about 1 million adults currently have aphasia (source: NIDCD). By 2020, in the US (projected population: 335 million), the incidence of aphasia is expected to rise to 180,000 cases and the prevalence to 2 million persons. This is because the population is increasing and aging; better procedures for acute neurological conditions increase survival rates; and more effective medications and maintenance regimens increase survivors’ life expectancy. These facts create an increasing need for intervention at all stages of the disease (early diagnosis, monitoring and treatment of language deficits, and development of compensatory strategies and tools) at a time when health care systems worldwide, whether private or socialized, are dealing with a severe economic crisis. I will discuss some recent applications that may make work on disorders of speech perception and speech production more effective. I will also discuss critical aspects of the communication disabilities observed in aphasic speakers, which make research in this area different from analogous work on unimpaired communication and which relate to a paramount need to develop simulations of human language that are both functional and biologically viable.

Speaker Biography

Prof. Gabriele Miceli received his medical degree from the Catholic University of Rome, and is board-certified in Neurology and in Psychiatry. He is currently Professor of Neurology at the University of Trento, and clinical director of the Center for Neurocognitive Rehabilitation of CIMeC (Center for Mind/Brain Studies) in Rovereto. He previously directed the Neuropsychological Evaluation Unit at the Institute of Neurology of the Catholic University (Rome) as an Associate Professor of Neurology. Current research collaborations involve Harvard University, Johns Hopkins University, the University of California at Irvine, and Fondazione Bruno Kessler. His main research area is the functional and neural organization of language functions. In the past few years, his interests have extended to the neuroimaging investigation of the neuroplasticity phenomena associated with training-induced recovery from language disorders. Present interests include the use of machine-learning algorithms in the diagnosis of language disorders, and the development of computer-assisted tools, based on conceptual lexicons, for the individualized treatment of language disorders.

Speech technology in (re)habilitation of persons with communication disabilities

Prof. Björn Granström
KTH - Royal Institute of Technology
TMH - Department of Speech, Music and Hearing, Stockholm, Sweden
http://www.speech.kth.se/~bjorn/

Abstract

Speech communication research and speech technology have found many applications for handicapped individuals. One of the very first examples of an application of speech synthesis was the reading machine for the blind. It is natural that results and devices in the speech communication field can be utilized for (re)habilitation of persons with communication disabilities. AAC (Augmentative and Alternative Communication) has evolved into an independent research area with strong input from speech and language processing. In this presentation we will look at the development of the field from very early speech training devices based on speech analysis to advanced systems including robotics and avatars capable of human-like interaction. We will show examples where pressing needs of disabled persons have inspired avant-garde applications and developments that have eventually spread to more general, widely used applications. In this sense the “design for all” paradigm has been a rewarding and fruitful driving force for many speech communication and technology researchers.

Speaker Biography

Prof. Björn Granström joined the department in 1969, after graduating with an MSc in Electrical Engineering. After further studies in Phonetics and General Linguistics at Stockholm University, he became Doctor of Science at KTH in 1977 with the thesis "Perception and Synthesis of Speech". In 1987 he succeeded Gunnar Fant as Professor in Speech Communication. He has been the director of CTT, the Center for Speech Technology, since its start in 1996. Together with Rolf Carlson, he created the first multilingual text-to-speech system, which has seen extensive use in the disability area. Present interests include multi-modal verbal/non-verbal communication, virtual language tutors, and human-like spoken dialogue systems.

From teleoperated androids to cellphones as surrogates

Prof. Hiroshi Ishiguro
Professor, Osaka University
Dept. of Adaptive Machine Systems
Intelligent Robotics Laboratory
Dept. of Systems Innovation
Graduate School of Engineering Science
Visiting Group Leader
Dept. of Communication Robots
ATR Intelligent Robotics and Communication Laboratories

Abstract

In order to understand the meaning of human presence, we have developed Geminoid, a teleoperated android of myself. With the android, we could learn how people adapt to the new medium. Based on this knowledge, we have recently developed a simpler teleoperated android with a minimal humanlike appearance. The new android is called Telenoid. People can easily adapt to Telenoid and enjoy conversations by using it. Further, we are remaking it at cell-phone size. It is called Elfoid. We believe this new type of cell phone can transmit our presence to distant places and change our lifestyle again.

Speaker Biography

Prof. Hiroshi Ishiguro (M’) received a D.Eng. in systems engineering from Osaka University, Japan, in 1991. He is currently Professor in the Graduate School of Engineering at Osaka University (2002–). He is also Visiting Group Leader (2002–) of the Intelligent Robotics and Communication Laboratories at the Advanced Telecommunications Research Institute, where he previously worked as Visiting Researcher (1999–2002). He was previously Research Associate (1992–1994) in the Graduate School of Engineering Science at Osaka University and Associate Professor (1998–2000) in the Department of Social Informatics at Kyoto University. He was also Visiting Scholar (1998–1999) at the University of California, San Diego, USA. He then became Associate Professor (2000–2001) and Professor (2001–2002) in the Department of Computer and Communication Sciences at Wakayama University.

His research areas are omnidirectional vision, distributed vision, sensor networks, humanoid robots, and android robots. In the last several years, he has mainly focused on the development of humanoids and androids. For this work, he receives over 2 million dollars every year in research grants from the Japanese government and private companies. The most successful robot he has made was exhibited at the Nagoya World Expo 05, for which he is recognized as the father of the world's first android. His activities have been covered by almost all major TV networks, such as the BBC, and by newspapers around the world. Recently, he was selected as one of the 100 living geniuses in the world by Synectics (UK, 2007).
