Mattias Heldner


A Scalable Method for Quantifying the Role of Pitch in Conversational Turn-Taking
Kornel Laskowski | Marcin Wlodarczak | Mattias Heldner
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Pitch has long been held as an important signalling channel when planning and deploying speech in conversation, and myriad studies have been undertaken to determine the extent to which it actually plays this role. Unfortunately, these studies have required considerable human investment in data preparation and analysis, and have therefore often been limited to a handful of specific conversational contexts. The current article proposes a framework which addresses these limitations, by enabling a scalable, quantitative characterization of the role of pitch throughout an entire conversation, requiring only the raw signal and speech activity references. The framework is evaluated on the Switchboard dialogue corpus. Experiments indicate that pitch trajectories of both parties are predictive of their incipient speech activity; that pitch should be expressed on a logarithmic scale and Z-normalized, as well as accompanied by a binary voicing variable; and that only the most recent 400 ms of the pitch trajectory are useful in incipient speech activity prediction.
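
The pitch representation named in the abstract (logarithmic scale, Z-normalization, a binary voicing variable, and a 400 ms history) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pitch_features`, the 10 ms frame step, the convention that unvoiced frames carry a pitch of 0, and the choice to compute normalization statistics over all voiced frames of the input are assumptions made here for illustration only.

```python
import math

def pitch_features(f0_hz, frame_ms=10, window_ms=400):
    """Return (z_log_pitch, voicing) for the most recent window_ms of frames.

    f0_hz: per-frame pitch estimates in Hz; 0 or None marks an unvoiced frame
    (an assumed convention, not taken from the paper).
    """
    # Log-scale pitch over voiced frames only.
    voiced_logs = [math.log(f) for f in f0_hz if f]
    if voiced_logs:
        mean = sum(voiced_logs) / len(voiced_logs)
        var = sum((x - mean) ** 2 for x in voiced_logs) / len(voiced_logs)
        std = math.sqrt(var) or 1.0  # avoid division by zero for constant pitch
    else:
        mean, std = 0.0, 1.0

    # Keep only the most recent window (400 ms by default).
    n = max(1, window_ms // frame_ms)
    recent = f0_hz[-n:]

    # Z-normalized log pitch; unvoiced frames get 0.0 and rely on the voicing flag.
    z = [(math.log(f) - mean) / std if f else 0.0 for f in recent]
    voicing = [1 if f else 0 for f in recent]
    return z, voicing
```

Each frame is thus described by two values, a normalized log-pitch value and a voicing flag, so that a downstream predictor can tell a genuinely average pitch apart from the absence of voicing.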


3rd party observer gaze as a continuous measure of dialogue flow
Jens Edlund | Simon Alexandersson | Jonas Beskow | Lisa Gustavsson | Mattias Heldner | Anna Hjalmarsson | Petter Kallionen | Ellen Marklund
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present an attempt at using 3rd party observer gaze to get a measure of how appropriate each segment in a dialogue is for a speaker change. The method is a step away from the current dependency on speaker turns or talkspurts towards a more general view of speaker changes. We show that 3rd party observers do indeed largely look at the same thing (the speaker), and how this can be captured and utilized to provide insights into human communication. The results also suggest that there might be differences in the distribution of 3rd party observer gaze depending on how information-rich an utterance is.