DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank

Henry Zou, Yue Zhou, Weizhi Zhang, Cornelia Caragea


Abstract
During crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support. Emergency relief organizations leverage such information to gain timely awareness of crisis circumstances and expedite rescue operations. While existing works utilize such information to build models for crisis event analysis, fully supervised approaches require annotating vast amounts of data and are impractical given the limited response time. On the other hand, semi-supervised models can be biased, performing moderately well for certain classes while performing extremely poorly for others, which can substantially harm disaster monitoring and rescue. In this paper, we first study two recent debiasing methods on semi-supervised crisis tweet classification. Then we propose a simple but effective debiasing method, DeCrisisMB, that utilizes a Memory Bank to store generated pseudo-labels and sample them equally from each class at each training iteration. Extensive experiments are conducted to compare the performance and generalization ability of different debiasing methods in both in-distribution and out-of-distribution settings. The results demonstrate the superior performance of our proposed method. Our code is available at https://github.com/HenryPengZou/DeCrisisMB.
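The memory-bank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it only shows one plausible way to store pseudo-labeled examples per class and draw the same number from every class at each iteration. The class name `PseudoLabelMemoryBank`, the per-class capacity, the confidence threshold, and the choice to resample with replacement when a class has too few entries are all assumptions made for illustration.

```python
import random
from collections import deque


class PseudoLabelMemoryBank:
    """Hypothetical per-class store of pseudo-labeled examples (illustrative only)."""

    def __init__(self, num_classes, capacity_per_class=256, seed=0):
        # One bounded FIFO queue per class: the oldest entries are evicted first.
        self.queues = [deque(maxlen=capacity_per_class) for _ in range(num_classes)]
        self.rng = random.Random(seed)

    def add(self, example_ids, pseudo_labels, confidences, threshold=0.95):
        # Assumption: only confident pseudo-labels are stored; the threshold
        # value is illustrative and not taken from the paper.
        for ex, label, conf in zip(example_ids, pseudo_labels, confidences):
            if conf >= threshold:
                self.queues[label].append(ex)

    def sample_balanced(self, per_class):
        # Draw the same number of examples from every non-empty class, so that
        # frequently predicted classes cannot dominate the unlabeled batch.
        batch = []
        for label, queue in enumerate(self.queues):
            if not queue:
                continue  # nothing stored for this class yet
            pool = list(queue)
            if len(pool) >= per_class:
                chosen = self.rng.sample(pool, per_class)  # without replacement
            else:
                chosen = self.rng.choices(pool, k=per_class)  # with replacement
            batch.extend((ex, label) for ex in chosen)
        return batch
```

In a training loop, one would call `add` with the model's predictions on each unlabeled batch and then call `sample_balanced` to build the pseudo-labeled batch used for the unsupervised loss, so that every class with stored entries contributes equally regardless of how skewed the raw predictions are.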
Anthology ID:
2023.findings-emnlp.406
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6104–6115
URL:
https://aclanthology.org/2023.findings-emnlp.406
DOI:
10.18653/v1/2023.findings-emnlp.406
Cite (ACL):
Henry Zou, Yue Zhou, Weizhi Zhang, and Cornelia Caragea. 2023. DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 6104–6115, Singapore. Association for Computational Linguistics.
Cite (Informal):
DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank (Zou et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.406.pdf