Capturing Formality in Speech Across Domains and Languages

Published in Interspeech, 2023

The linguistic notion of formality is one dimension of stylistic variation in human communication. A universal characteristic of language production, formality has surface-level realizations in written and spoken language. In this work, we explore ways of measuring the formality of such realizations in multilingual speech corpora across a wide range of domains. We compare measures of formality, contrasting textual and acoustic-prosodic metrics. We believe that a combination of these should correlate well with downstream applications. Our findings include: an indication that certain prosodic variables might play a stronger role than others; no correlation between prosodic and textual measures; limited evidence for anticipated inter-domain trends, but some evidence of consistency of measures between languages. We conclude that non-lexical indicators of formality in speech may be more subtle than our initial expectations, motivating further work on reliably encoding spoken formality.

Recommended citation: Bhattacharya, D., Chi, J., Hirschberg, J., Bell, P. (2023) Capturing Formality in Speech Across Domains and Languages. Proc. Interspeech 2023, 1030-1034, doi: 10.21437/Interspeech.2023-1852