Huang Hui


2024

Mitigating the Bias of Large Language Model Evaluation
Zhou Hongli | Huang Hui | Long Yunfei | Xu Bing | Zhu Conghui | Cao Hailong | Yang Muyun | Zhao Tiejun
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“Recently, there has been a trend of evaluating the Large Language Model (LLM) quality in the flavor of LLM-as-a-Judge, namely leveraging another LLM to evaluate the current output quality. However, existing judges are proven to be biased, namely they would favor answers which present better superficial quality (such as verbosity, fluency) while ignoring the instruction following ability. In this work, we propose systematic research about the bias of LLM-as-a-Judge. Specifically, for closed-source judge models, we apply calibration to mitigate the significance of superficial quality, both on probability level and prompt level. For open-source judge models, we propose to mitigate the bias by contrastive training, with curated negative samples that deviate from instruction but present better superficial quality. We apply our methods on the bias evaluation benchmark, and experiment results show our methods mitigate the bias by a large margin while maintaining a satisfactory evaluation accuracy.”
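
The probability-level calibration mentioned in the abstract can be illustrated with a small sketch. The judge interface and the exact calibration term below are assumptions for illustration, not the paper's implementation; the idea shown is simply to subtract an instruction-free score so that answers rewarded only for superficial quality are discounted.

```python
# Minimal sketch of probability-level calibration for an LLM judge.
# `judge_logprob` is a hypothetical callable returning the judge's log-probability
# of endorsing `answer`; it is not an API from the paper.

def calibrated_score(judge_logprob, instruction, answer):
    # Log-probability of endorsing the answer given the actual instruction.
    conditioned = judge_logprob(instruction=instruction, answer=answer)
    # The same query with the instruction removed reflects only superficial
    # quality (fluency, verbosity), since no instruction-following signal remains.
    prior = judge_logprob(instruction="", answer=answer)
    # Subtracting the prior in log space discounts answers that merely look good
    # regardless of what was asked.
    return conditioned - prior


if __name__ == "__main__":
    # Toy judge with fixed log-probabilities, for illustration only.
    toy_judge = lambda instruction, answer: -1.0 if instruction else -0.5
    print(calibrated_score(toy_judge, "Summarize the article.", "A long, fluent answer."))
```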

2023

Improving Zero-shot Cross-lingual Dialogue State Tracking via Contrastive Learning
Xiang Yu | Zhang Ting | Di Hui | Huang Hui | Li Chunyou | Ouchi Kazushige | Chen Yufeng | Xu Jinan
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“Recent works in dialogue state tracking (DST) focus on a handful of languages, as collecting large-scale manually annotated data in different languages is expensive. Existing models address this issue by code-switched data augmentation or intermediate fine-tuning of multilingual pre-trained models. However, these models can only perform implicit alignment across languages. In this paper, we propose a novel model named Contrastive Learning for Cross-Lingual DST (CLCL-DST) to enhance zero-shot cross-lingual adaptation. Specifically, we use a self-built bilingual dictionary for lexical substitution to construct multilingual views of the same utterance. Then our approach leverages fine-grained contrastive learning to encourage representations of specific slot tokens in different views to be more similar than negative example pairs. By this means, CLCL-DST aligns similar words across languages into a more refined language-invariant space. In addition, CLCL-DST uses a significance-based keyword extraction approach to select task-related words to build the bilingual dictionary for better cross-lingual positive examples. Experiment results on Multilingual WoZ 2.0 and parallel MultiWoZ 2.1 datasets show that our proposed CLCL-DST outperforms existing state-of-the-art methods by a large margin, demonstrating the effectiveness of CLCL-DST.”
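
The two ingredients described in the abstract, dictionary-based lexical substitution and fine-grained (token-level) contrastive learning, can be sketched roughly as follows. The dictionary, tensor shapes, and temperature are assumptions for illustration; the encoder that would produce the slot-token representations is omitted, and this is not the paper's code.

```python
import torch
import torch.nn.functional as F

def code_switch(tokens, bilingual_dict):
    # Replace task-related words with their dictionary translations when available,
    # yielding a code-switched view of the same utterance.
    return [bilingual_dict.get(t, t) for t in tokens]

def token_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    # anchor, positive: (d,) representations of the same slot token in two views.
    # negatives:        (n, d) representations of other tokens in the batch.
    anchor = F.normalize(anchor, dim=-1)
    candidates = F.normalize(torch.cat([positive.unsqueeze(0), negatives], dim=0), dim=-1)
    logits = candidates @ anchor / temperature      # similarity of anchor to all candidates
    target = torch.zeros(1, dtype=torch.long)       # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random vectors standing in for encoder outputs.
d = 16
loss = token_contrastive_loss(torch.randn(d), torch.randn(d), torch.randn(8, d))
```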

2022

Towards Making the Most of Pre-trained Translation Model for Quality Estimation
Li Chunyou | Di Hui | Huang Hui | Ouchi Kazushige | Chen Yufeng | Liu Jian | Xu Jinan
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“Machine translation quality estimation (QE) aims to evaluate the quality of machine translation automatically without relying on any reference. One common practice is applying the translation model as a feature extractor. However, there exist several discrepancies between the translation model and the QE model. The translation model is trained in an autoregressive manner, while the QE model is performed in a non-autoregressive manner. Besides, the translation model only learns to model human-crafted parallel data, while the QE model needs to model machine-translated noisy data. In order to bridge these discrepancies, we propose two strategies to post-train the translation model, namely Conditional Masked Language Modeling (CMLM) and Denoising Restoration (DR). Specifically, CMLM learns to predict masked tokens at the target side conditioned on the source sentence. DR firstly introduces noise to the target side of parallel data, and the model is trained to detect and recover the introduced noise. Both strategies can adapt the pre-trained translation model to the QE-style prediction task. Experimental results show that our model achieves impressive results, significantly outperforming the baseline model, verifying the effectiveness of our proposed methods.”
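
As an illustration of the CMLM-style objective described above, the sketch below builds one masked-target training example. The token ids, masking rate, and special-id values are assumptions rather than the paper's settings; the DR variant would instead corrupt the target with sampled noise and label the corrupted positions.

```python
import torch

# Assumed special ids for this sketch only.
MASK_ID = 4
PAD_ID = 0

def make_cmlm_example(target_ids, mask_prob=0.15):
    # target_ids: (seq_len,) integer token ids of the target sentence.
    target = target_ids.clone()
    labels = torch.full_like(target, -100)   # -100 = ignored by cross-entropy
    mask = (torch.rand_like(target, dtype=torch.float) < mask_prob) & (target != PAD_ID)
    labels[mask] = target[mask]              # predict only the masked positions
    target[mask] = MASK_ID                   # replace them with [MASK] in the input
    return target, labels

# The translation model is then trained to recover `labels` from (source, masked target),
# i.e. a non-autoregressive, QE-style prediction over the target side.
tgt = torch.tensor([12, 57, 8, 99, 3, 0, 0])
masked_tgt, labels = make_cmlm_example(tgt)
```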

Supervised Contrastive Learning for Cross-lingual Transfer Learning
Wang Shuaibo | Di Hui | Huang Hui | Lai Siyu | Ouchi Kazushige | Chen Yufeng | Xu Jinan
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“Multilingual pre-trained representations are not well-aligned by nature, which harms their performance on cross-lingual tasks. Previous methods propose to post-align the multilingual pre-trained representations by multi-view alignment or contrastive learning. However, we argue that both methods are not suitable for the cross-lingual classification objective, and in this paper we propose a simple yet effective method to better align the pre-trained representations. On the basis of cross-lingual data augmentations, we make a minor modification to the canonical contrastive loss, to remove false-negative examples which should not be contrasted. Augmentations with the same class are brought close to the anchor sample, and augmentations with different classes are pushed apart. Experiment results on three cross-lingual tasks from the XTREME benchmark show our method could improve the transfer performance by a large margin with no additional resource needed. We also provide in-detail analysis and comparison between different post-alignment strategies.”
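
A compact sketch of the kind of modified contrastive objective the abstract describes, under assumed shapes and temperature (not the paper's code): examples sharing the anchor's label are pulled toward it as positives rather than being pushed apart as false negatives, while only different-class examples remain in the contrast set.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    # features: (n, d) representations of anchors and their cross-lingual augmentations.
    # labels:   (n,)   class labels; an augmentation keeps the label of its source example.
    features = F.normalize(features, dim=-1)
    sim = features @ features.T / temperature            # pairwise similarities (n, n)
    self_mask = torch.eye(len(labels), dtype=torch.bool)
    # Same-class pairs are treated as positives instead of false negatives.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))      # never contrast a sample with itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(1).clamp(min=1)
    # Maximise the average log-probability of same-class pairs for each anchor.
    return -(log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_counts).mean()

# Toy usage with random vectors standing in for encoder outputs.
loss = supervised_contrastive_loss(torch.randn(8, 16), torch.randint(0, 3, (8,)))
```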