ACM Symposium on Eye Tracking Research and Applications, 2016
Abstract: One way to make video chat more efficient is to send only those video frames that the remote participant is likely to look at. Gaze in dialog is intimately tied to dialog states and behaviors, so it should be possible to predict when a participant will be looking. To investigate, we collected data on both participants in 6 video-chat sessions, totalling 65 minutes, and created a model to predict whether a participant will be looking at the screen 300 milliseconds in the future, based on prosodic and gaze information available at the other side. A simple predictor had a precision of 42% at the equal error rate. While this is probably not good enough to be useful, improved performance should be readily achievable.
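For readers unfamiliar with the metric, the following is a minimal sketch of how a precision figure at the equal error rate might be computed from a predictor's scores. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `precision_at_eer`, the choice of "looking at the screen" as the positive class, and the toy scores and labels are all hypothetical, and the abstract does not say how the equal-error operating point was located.

```python
def precision_at_eer(scores, labels):
    """Return (threshold, precision) at the operating point where the
    false-positive rate and false-negative rate are closest to equal.

    scores: predicted likelihood that the participant will be looking
            (hypothetical values; higher means more likely).
    labels: 1 if the participant actually was looking, else 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    # Try every observed score as a decision threshold.
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y)
        fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
        fn = positives - tp
        # Gap between false-positive rate and false-negative rate.
        gap = abs(fp / negatives - fn / positives)
        # tp + fp >= 1 because the threshold is itself an observed score.
        precision = tp / (tp + fp)
        if best is None or gap < best[0]:
            best = (gap, threshold, precision)
    return best[1], best[2]


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
threshold, precision = precision_at_eer(toy_scores, toy_labels)
print(threshold, precision)
```

On real data the two error rates rarely match exactly at any observed threshold, so this sketch settles for the threshold that minimizes the gap; interpolating between neighboring thresholds is a common alternative.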
Nigel Ward's Publications