The Gaps between Fine Tuning and In-context Learning in Bias Evaluation and Debiasing

Masahiro Kaneko, Danushka Bollegala, Timothy Baldwin


Abstract
The output tendencies of pre-trained language models (PLMs) vary markedly before and after fine-tuning (FT) because FT updates the model parameters. These divergences in output tendencies result in a gap in the social biases of PLMs. For example, there exists a low correlation between the intrinsic bias scores of a PLM and its extrinsic bias scores under FT-based debiasing methods. Additionally, applying FT-based debiasing methods to a PLM leads to a decline in performance on downstream tasks. On the other hand, PLMs trained on large datasets can learn without parameter updates via in-context learning (ICL) using prompts. ICL induces smaller changes to PLMs than FT-based debiasing methods do. Therefore, we hypothesize that the gap observed between pre-trained and FT models does not hold for debiasing methods that use ICL. In this study, we demonstrate that ICL-based debiasing methods show a higher correlation between intrinsic and extrinsic bias scores than FT-based methods. Moreover, the performance degradation due to debiasing is also lower in the ICL case than in the FT case.
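The central quantity in the abstract is the correlation between intrinsic and extrinsic bias scores across models. A minimal sketch of such a computation follows, assuming Pearson's r as the measure (the abstract does not say which correlation coefficient is used). The function name `pearson` and all score values are hypothetical placeholders chosen for illustration, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Placeholder scores, one entry per (hypothetical) debiased model.
# These numbers are invented for illustration only.
intrinsic = [0.10, 0.20, 0.30, 0.40]

# Case A: extrinsic scores that move together with the intrinsic ones.
extrinsic_tracking = [0.15, 0.25, 0.35, 0.45]

# Case B: extrinsic scores that show no linear relation to the intrinsic ones.
extrinsic_unrelated = [0.30, 0.10, 0.40, 0.20]

print("tracking: ", pearson(intrinsic, extrinsic_tracking))
print("unrelated:", pearson(intrinsic, extrinsic_unrelated))
```

Under this reading, a debiasing setting whose models behave like case A would be one where an intrinsic measure is a useful proxy for downstream bias, while case B corresponds to the kind of gap the abstract describes for FT-based methods. A rank correlation could be substituted for Pearson's r without changing the structure of the sketch.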
Anthology ID:
2025.coling-main.187
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2758–2764
URL:
https://aclanthology.org/2025.coling-main.187/
Cite (ACL):
Masahiro Kaneko, Danushka Bollegala, and Timothy Baldwin. 2025. The Gaps between Fine Tuning and In-context Learning in Bias Evaluation and Debiasing. In Proceedings of the 31st International Conference on Computational Linguistics, pages 2758–2764, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
The Gaps between Fine Tuning and In-context Learning in Bias Evaluation and Debiasing (Kaneko et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.187.pdf