IEEE Workshop on Spoken Language Technology (SLT), 2018, pages 827-831.
Abstract:
Going beyond turn-taking models built to solve specific tasks, such as
predicting whether a user will hold their turn after a pause, there is
growing interest in more general turn-taking models that subsume
many such tasks, and very good results have recently been obtained
(Skantze 2017). Here we present an improved recurrent network model
that outperforms previous work and does so without requiring
lexical annotation. Further, we show that this model can be trained
for different languages with no modifications, providing good results
in turn-taking prediction for English, Spanish, Japanese, Mandarin and
French. We also show that our model performs well across genres,
including task-oriented dialog and general conversation.