IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2009, pp. 323-326.
Department of Computer Science, University of Texas at El Paso
Abstract: In spoken dialog, speakers are simultaneously engaged in various mental processes, and it seems likely that the word that will be said next depends, to some extent, on the states of these mental processes. Further, these states can be inferred, to some extent, from properties of the speaker's voice as they change from moment to moment. As an illustration of how to apply these ideas in language modeling, we examine volume and speaking rate as predictors of the upcoming word. Combining this information with a trigram model gave a 2.6% improvement in perplexity.
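For readers unfamiliar with the metric, the sketch below shows how perplexity is computed and how a relative improvement such as the one reported above would be measured. The abstract does not say how the prosodic information was combined with the trigram model; the linear interpolation used here is an assumption chosen for illustration, not the paper's method. The function names, the weight `lam`, and all probability values are hypothetical.

```python
import math

def perplexity(probs):
    """Perplexity from the probabilities a model assigned to each observed word."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))

def interpolate(p_trigram, p_prosody, lam):
    """Linear interpolation of two per-word probability estimates.

    ASSUMPTION: interpolation is only one possible way to combine models;
    the abstract does not state which combination method was used.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_trigram, p_prosody)]

def relative_reduction(baseline, combined):
    """Fractional perplexity reduction of the combined model over the baseline."""
    return (baseline - combined) / baseline

# Invented toy probabilities for four observed words (not data from the paper).
p_trigram = [0.10, 0.02, 0.30, 0.05]   # hypothetical trigram estimates
p_prosody = [0.12, 0.04, 0.25, 0.06]   # hypothetical prosody-conditioned estimates

baseline = perplexity(p_trigram)
combined = perplexity(interpolate(p_trigram, p_prosody, lam=0.7))
print(f"baseline perplexity: {baseline:.2f}")
print(f"combined perplexity: {combined:.2f}")
print(f"relative reduction:  {relative_reduction(baseline, combined):.1%}")
```

Under this definition, a 2.6% improvement corresponds to `relative_reduction` returning 0.026, for example a drop from a perplexity of 100 to 97.4.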
Nigel Ward's Publications