Patomporn Payoungkhamdee
2024
An Empirical Study of Multilingual Reasoning Distillation for Question Answering
Patomporn Payoungkhamdee | Peerat Limkonchotiwat | Jinheon Baek | Potsawee Manakul | Can Udomcharoenchaikit | Ekapol Chuangsuwanich | Sarana Nutanong
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Reasoning is one crucial capability in Large Language Models (LLMs), allowing them to perform complex tasks such as solving math problems and multi-step planning. While reasoning capability can emerge in larger models, smaller ones usually have to rely on distillation to transfer this capability from a larger model. However, recent efforts to distill reasoning capabilities have focused mainly on English, leaving multilingual distillation underexplored. To address this gap, this paper examines existing English reasoning distillation methods that utilize a variety of positive rationales in multilingual settings and proposes d-CoT-nR, a novel approach that incorporates incorrect rationales as additional guidance. Empirical results from multilingual high-school examinations show that d-CoT-nR significantly surpasses the baseline, improving accuracy in unseen languages and correctness in step-by-step reasoning.
2022
Mitigating Spurious Correlation in Natural Language Understanding with Counterfactual Inference
Can Udomcharoenchaikit | Wuttikorn Ponwitayarat | Patomporn Payoungkhamdee | Kanruethai Masuk | Weerayut Buaphet | Ekapol Chuangsuwanich | Sarana Nutanong
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Despite their promising results on standard benchmarks, NLU models are still prone to making predictions based on shortcuts caused by unintended bias in the dataset. For example, an NLI model may use lexical overlap as a shortcut to make entailment predictions due to repetitive data generation patterns from annotators, also called annotation artifacts. In this paper, we propose a causal analysis framework to help debias NLU models. We show that (1) by defining causal relationships, we can introspect how much annotation artifacts affect the outcomes; (2) with this knowledge, we can utilize counterfactual inference to mitigate bias, and we find that viewing a model as a treatment mitigates bias more effectively than viewing annotation artifacts as a treatment; and (3) in addition to bias mitigation, we can interpret how much each debiasing strategy is affected by annotation artifacts. Our experimental results show that using counterfactual inference can improve out-of-distribution performance in all settings while maintaining high in-distribution performance.