Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization

Pengtao Xie, Xingchen Zhao, Xuehai He


Abstract
In many NLP applications, source data is collected to mitigate data deficiency in a target task and to help train the target model. Existing transfer learning methods either select a subset of source examples that are close to the target domain or adapt all source examples into the target domain, and then train the target model on the selected or adapted examples. Selection incurs significant information loss, while adaptation bears the risk that source examples that were already in the target domain end up outside it after adaptation. To address these limitations, we propose a framework based on four-level optimization that simultaneously selects and adapts source data. Our method automatically identifies in-domain and out-of-domain source examples and applies example-specific processing: selection for in-domain examples and adaptation for out-of-domain examples. Experiments on various datasets demonstrate the effectiveness of the proposed method.
Anthology ID:
2024.tacl-1.25
Volume:
Transactions of the Association for Computational Linguistics, Volume 12
Year:
2024
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
449–466
URL:
https://aclanthology.org/2024.tacl-1.25
DOI:
10.1162/tacl_a_00658
Cite (ACL):
Pengtao Xie, Xingchen Zhao, and Xuehai He. 2024. Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization. Transactions of the Association for Computational Linguistics, 12:449–466.
Cite (Informal):
Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization (Xie et al., TACL 2024)
PDF:
https://aclanthology.org/2024.tacl-1.25.pdf