The importance of fillers for text representations of speech transcripts

Tanvi Dinkar, Pierre Colombo, Matthieu Labeau, Chloé Clavel


Abstract
While being an essential component of spoken language, fillers (e.g. “um” or “uh”) often remain overlooked in Spoken Language Understanding (SLU) tasks. We explore the possibility of representing them with deep contextualised embeddings, showing improvements on modelling spoken language and two downstream tasks — predicting a speaker’s stance and expressed confidence.
Anthology ID:
2020.emnlp-main.641
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7985–7993
URL:
https://aclanthology.org/2020.emnlp-main.641
DOI:
10.18653/v1/2020.emnlp-main.641
Cite (ACL):
Tanvi Dinkar, Pierre Colombo, Matthieu Labeau, and Chloé Clavel. 2020. The importance of fillers for text representations of speech transcripts. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7985–7993, Online. Association for Computational Linguistics.
Cite (Informal):
The importance of fillers for text representations of speech transcripts (Dinkar et al., EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.641.pdf
Video:
https://slideslive.com/38938831