Interspeech, 2012.
Department of Computer Science, University of Texas at El Paso
Abstract: Inspired by the goal of modeling the dialog state and the speaker's mental state, moment by moment, we apply Principal Component Analysis to a vector of 76 prosodic features spanning 6 seconds of context. This gives a multidimensional representation of the current state. We find that word probabilities vary strongly with several of these dimensions, that the use of this information in a language model gives a 27% reduction in perplexity, and that many of the dimensions do relate to aspects of mental state and dialog state.
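As a rough illustration of the dimensionality-reduction step, the following is a minimal sketch of PCA applied to 76-dimensional feature vectors, one per moment in time. It is not the authors' implementation. The random input data, the per-feature z-normalization, the choice of 10 retained dimensions, and the function names `fit_pca` and `project` are all assumptions made for this example.

```python
import numpy as np


def fit_pca(features, n_components):
    """Fit PCA on an (n_frames, n_features) matrix via SVD.

    Returns the per-feature mean and standard deviation used for
    normalization, the top principal directions (one per row), and the
    fraction of variance each retained direction explains.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    z = (features - mean) / std  # assumption: z-normalize each feature

    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    variance = singular_values ** 2
    explained = variance / variance.sum()
    return mean, std, vt[:n_components], explained[:n_components]


def project(features, mean, std, components):
    """Map feature vectors onto the retained principal directions."""
    return ((features - mean) / std) @ components.T


# Placeholder data: 1000 time points, 76 features each.
# Real input would be prosodic measurements, not random numbers.
rng = np.random.default_rng(0)
frames = rng.normal(size=(1000, 76))

mean, std, components, explained = fit_pca(frames, n_components=10)
scores = project(frames, mean, std, components)
```

Each row of `scores` is then a low-dimensional description of one moment, and its columns are mutually uncorrelated on the data used for fitting. A language model could condition on such values, though how that conditioning is done is not specified by this sketch.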