Adaptive Unsupervised Self-training for Disfluency Detection

Zhongyuan Wang, Yixuan Wang, Shaolei Wang, Wanxiang Che


Abstract
Supervised methods have achieved remarkable results in disfluency detection. However, in real-world scenarios, human-annotated data is difficult to obtain. Recent work handles disfluency detection with unsupervised self-training, which can exploit existing large-scale unlabeled data efficiently. However, these self-training-based methods suffer from selection bias and error accumulation. To tackle these problems, we propose an adaptive unsupervised self-training method for disfluency detection. Specifically, we re-weight the importance of each training example according to its grammatical features and prediction confidence. Experiments on the Switchboard dataset show that our method improves by 2.3 points over the current SOTA unsupervised method. Moreover, our method is competitive with the SOTA supervised method.
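The abstract's central idea, re-weighting each pseudo-labeled example by the teacher model's prediction confidence and a grammaticality score, can be illustrated with a minimal sketch. The snippet below is only an assumption-laden illustration: the `alpha` interpolation, the `grammar_scores` input, and the normalization are hypothetical choices, not the paper's exact formulation (see the linked repository for the authors' implementation).

```python
import torch

def pseudo_label_weights(logits, grammar_scores, alpha=0.5):
    """Illustrative per-example weights for adaptive self-training.

    logits:         (batch, num_classes) raw teacher-model scores
    grammar_scores: (batch,) scores in [0, 1] from a grammaticality check
    alpha:          interpolation between confidence and grammaticality
                    (hypothetical hyper-parameter, not from the paper)
    """
    probs = torch.softmax(logits, dim=-1)
    confidence = probs.max(dim=-1).values           # teacher prediction confidence
    weights = alpha * confidence + (1 - alpha) * grammar_scores
    return weights / weights.sum()                  # normalize over the batch

# Usage sketch: scale each pseudo-labeled example's loss by its weight,
# so low-confidence or ungrammatical pseudo-labels contribute less.
# loss = (weights * per_example_loss).sum()
```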
Anthology ID:
2022.coling-1.632
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
7209–7218
URL:
https://aclanthology.org/2022.coling-1.632
Cite (ACL):
Zhongyuan Wang, Yixuan Wang, Shaolei Wang, and Wanxiang Che. 2022. Adaptive Unsupervised Self-training for Disfluency Detection. In Proceedings of the 29th International Conference on Computational Linguistics, pages 7209–7218, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Adaptive Unsupervised Self-training for Disfluency Detection (Wang et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.632.pdf
Code:
wyxstriker/reweightingdisfluency