Towards a General-Purpose Model of Perceived Pragmatic Similarity

Nigel G. Ward, Andres Segura, Alejandro Ceballos, Divette Marco

Interspeech 2024

Abstract: Models for estimating the similarity between two utterances are fundamental in speech technology. While fairly good automatic measures exist for semantic similarity, pragmatic similarity has not been previously explored. Using a new collection of thousands of human judgments of the pragmatic similarity between utterance pairs, we train and evaluate various predictive models. The best performing model, which uses 103 features selected from HuBERT's 24th layer, correlates on average 0.74 with human judges for the highest-quality data subset, and it sometimes approaches human inter-annotator agreement. We also find evidence for some degree of generality across languages.
paper, pdf
code, and also a new implementation
data
demo video
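
As a rough illustration of the kind of model the abstract describes, the sketch below scores an utterance pair by comparing a selected subset of pooled features and then checks the scores against human ratings with a Pearson correlation. It is not the released code: the mean-pooling over frames, the use of cosine similarity, and the names `mean_pool`, `pragmatic_similarity`, `pearson`, and `selected` are all assumptions made for illustration, and the toy numbers stand in for real HuBERT layer-24 frame features and real human judgments.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level features (T frames x D dims) into one D-dim vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dim)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pragmatic_similarity(frames_a, frames_b, selected: Sequence[int]) -> float:
    """Compare two utterances using only the selected feature indices.

    Assumption: `selected` plays the role of the 103 chosen features;
    how the paper actually selects them is not reproduced here.
    """
    pooled_a = mean_pool(frames_a)
    pooled_b = mean_pool(frames_b)
    return cosine([pooled_a[i] for i in selected],
                  [pooled_b[i] for i in selected])


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between model scores and human ratings."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Toy 4-dimensional "features" for three utterances (made-up values).
    utt_a = [[0.9, 0.1, 0.3, 0.5], [1.1, 0.0, 0.2, 0.4]]
    utt_b = [[1.0, 0.2, 0.3, 0.6], [0.8, 0.1, 0.1, 0.5]]
    utt_c = [[0.1, 0.9, 0.8, 0.0], [0.0, 1.0, 0.7, 0.1]]
    selected = [0, 1, 3]  # hypothetical selected feature indices

    pairs = [(utt_a, utt_b), (utt_a, utt_c), (utt_b, utt_c)]
    model_scores = [pragmatic_similarity(x, y, selected) for x, y in pairs]
    human_ratings = [4.5, 1.5, 2.0]  # made-up ratings, one per pair
    print(model_scores, pearson(model_scores, human_ratings))
```

In practice the frame features would come from a pretrained speech model rather than hand-written lists, and the correlation would be computed per judge and averaged, as the abstract's "on average 0.74" suggests.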

An example of an utterance pair rated highly similar by humans and by our model, but not by a lexical similarity model: F_M3l_BS_4f_02.wav

Nigel Ward's website