Yu Xiang


2024

Test-time adaptation (TTA) aims to adapt a neural network to the distribution of the target domain using only unlabeled test data. Most previous TTA methods have achieved success under mild conditions, such as considering only a single domain or multiple independent static domains. However, in real-world settings, the test data is sampled in a correlated manner and the test environments undergo continual changes over time, which may cause previous TTA methods to fail in practical noise scenarios, i.e., independent noise distribution shifts, continual noise distribution shifts, and continual mixed distribution shifts. To address these issues, we design a Stable Test-time Adaptation Framework, called STAF, to stabilize the adaptation process. Specifically, to boost model robustness to noise distribution shifts, we present a multi-stream perturbation consistency method that enforces consistency from weak to strong views, guided by the weak view of the original sample. Meanwhile, we develop a reliable memory-based corrector that utilizes reliable snapshots between the anchor model and the adapted model to correct prediction bias. Furthermore, we propose a dynamic parameter restoration strategy that accounts for both the distribution shift and the degree of sample adaptation to alleviate error accumulation and catastrophic forgetting. Extensive experiments demonstrate the robustness and effectiveness of STAF, which pushes the boundaries of test-time adaptation toward more realistic scenarios and paves the way for the stable deployment of real-world applications.
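
The weak-to-strong consistency idea can be sketched as a simple loss: the weak view of the original sample provides a fixed guidance distribution, and each strongly perturbed stream is pulled toward it. The snippet below is a minimal PyTorch illustration, not STAF's actual implementation; the augmentation callables `weak_aug` and `strong_augs` and the KL objective are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def perturbation_consistency_loss(model, x, weak_aug, strong_augs):
    """Weak-to-strong consistency sketch (illustrative, not the paper's code)."""
    # The weak view of the original sample gives fixed guidance: no gradient flows here.
    with torch.no_grad():
        weak_probs = F.softmax(model(weak_aug(x)), dim=-1)
    loss = x.new_zeros(())
    for aug in strong_augs:
        strong_log_probs = F.log_softmax(model(aug(x)), dim=-1)
        # KL(weak || strong): align each strongly perturbed stream with the weak guidance.
        loss = loss + F.kl_div(strong_log_probs, weak_probs, reduction="batchmean")
    return loss / len(strong_augs)
```

Detaching the weak-view probabilities keeps the guidance signal stable, so gradients only flow through the strongly perturbed streams being adapted.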

2023

Recent works in dialogue state tracking (DST) focus on a handful of languages, as collecting large-scale manually annotated data in different languages is expensive. Existing models address this issue by code-switched data augmentation or intermediate fine-tuning of multilingual pre-trained models. However, these models can only perform implicit alignment across languages. In this paper, we propose a novel model named Contrastive Learning for Cross-Lingual DST (CLCL-DST) to enhance zero-shot cross-lingual adaptation. Specifically, we use a self-built bilingual dictionary for lexical substitution to construct multilingual views of the same utterance. Then our approach leverages fine-grained contrastive learning to encourage representations of specific slot tokens in different views to be more similar than negative example pairs. By this means, CLCL-DST aligns similar words across languages into a more refined language-invariant space. In addition, CLCL-DST uses a significance-based keyword extraction approach to select task-related words to build the bilingual dictionary for better cross-lingual positive examples. Experimental results on the Multilingual WoZ 2.0 and parallel MultiWoZ 2.1 datasets show that our proposed CLCL-DST outperforms existing state-of-the-art methods by a large margin, demonstrating the effectiveness of CLCL-DST.
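
The fine-grained contrastive objective can be illustrated as a token-level InfoNCE loss over slot-token representations from the two language views. This is a minimal sketch under assumed inputs (a single anchor token, one positive from the lexically substituted view, and a batch of negative tokens); it is not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def slot_token_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Token-level InfoNCE sketch (illustrative shapes: anchor/positive (d,), negatives (K, d))."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    # Cosine similarities scaled by temperature.
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature  # (1,)
    neg_sim = negatives @ anchor / temperature                         # (K,)
    logits = torch.cat([pos_sim, neg_sim], dim=0).unsqueeze(0)         # (1, K+1)
    # Index 0 is the positive pair; cross-entropy pushes it above all negatives,
    # pulling the same slot token across language views into a shared space.
    target = torch.zeros(1, dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, target)
```

Normalizing the representations and dividing by a temperature are standard InfoNCE choices; they make the similarity scale comparable across slot tokens of different magnitudes.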