Implications of Deep Learning for Dialog Modeling
Although deep learning has transformed many speech technologies, the impact on dialog has so far been more modest. This special session will explore opportunities and challenges for dialog research in the deep learning era.
Late Breaking Results
Domain-Independent Turn-Level Dialogue Quality Estimation via User Satisfaction Estimation. Praveen Kumar Bodigutla, Longshaokan Wang, Kate Ridgeway, Joshua Levy, Swanand Joshi, Alborz Geramifard, Spyros Matsoukas.
Predictive Turn-Taking Decisions with POMDPs. Matthew Roddy, Naomi Harte.
Motivating Questions (from the Call for Papers)
- Current target functions for machine learning of dialog behaviors are simple and use very limited information. In human-human dialogs, by contrast, the interactants continuously give each other subtle feedback and guidance. Can we use such signals, or a proxy such as continuous quality annotations, to help train dialog systems? More generally, what new data sets and annotation types are needed for deep learning for dialog?
- How does deep learning change the possibilities for modularity and portability of dialog systems? Will modules optimized by multi-task learning correspond to the traditional modules used by dialog systems builders? Alternatively, how can dialog-level metrics be used to improve training and tuning of component technologies? Can deep learning improve the quality of decisions that involve trade-offs across multiple modules, for example, making turn-taking decisions not in isolation but also on the basis of considerations of ASR uncertainty, backend delay estimates, and speech synthesis limitations? Can transfer learning techniques supplant the use of reusable modules as a strategy for enabling adaptation of dialog systems to new domains?
- Traditionally dialog systems development has involved a human-designed skeleton, integrated with targeted uses of machine learning to optimize certain key decisions as the skeleton is fleshed out. If this is reversed, such that the skeleton is end-to-end trained, how will the contributions of human designers be integrated, including design heuristics, models of semantics and pragmatics, policies required by business logic, and methods for creating API calls?
- Beyond task-oriented and chat systems, how can deep learning support the development of dialog systems for genres such as tutoring, behavior change coaching, physically situated dialog, information navigation and so on? Beyond dialog systems building, how can deep learning inform other work in dialog, for example for assessment, for teaching second language learners, for communication skills training, for information extraction from recorded dialogs, and for understanding the cognitive mechanisms underlying dialog skills in humans?
- Other topics relevant to deep learning and dialog modeling.
Submission Process
Full papers were submitted through the general SIGdial mechanism, were selected after evaluation by the regular SIGdial peer review process and were presented orally. Work-in-progress (late-breaking results) papers were solicited as presentations of ongoing work, circumscribed contributions, or focused programmatic proposals. These were reviewed by the special session organizing committee and presented as posters.
Organizers
Yun-Nung (Vivian) Chen, National Taiwan University
Gabriel Skantze, KTH, Stockholm
Tatsuya Kawahara, Kyoto University
Nigel G. Ward, University of Texas at El Paso