Yao-Chung Fan


2023

Distractor Generation based on Text2Text Language Models with Pseudo Kullback-Leibler Divergence Regulation
Hui-Juan Wang | Kai-Yu Hsieh | Han-Cheng Yu | Jui-Ching Tsou | Yu An Shih | Chen-Hua Huang | Yao-Chung Fan
Findings of the Association for Computational Linguistics: ACL 2023

In this paper, we address the task of cloze-style multiple-choice question (MCQ) distractor generation. Our study features the following designs. First, we formulate cloze distractor generation as a Text2Text task. Second, we propose a pseudo Kullback-Leibler divergence for regulating the generation to account for the item discrimination index used in educational assessment. Third, we explore a candidate augmentation strategy and multi-task training with cloze-related tasks to further boost generation performance. Through experiments on benchmark datasets, our best-performing model advances the state-of-the-art result from 10.81 to 22.00 (P@1 score).
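The abstract does not spell out the exact form of the pseudo Kullback-Leibler regularizer, so the following is only a minimal PyTorch sketch of the general idea: a KL-style penalty added to the usual Text2Text cross-entropy loss. The function name, the choice of which distributions are compared, and the weight beta are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def pseudo_kl_regularized_loss(gen_logits, target_ids,
                               answer_logits, distractor_logits, beta=0.1):
    """Hypothetical sketch of a KL-regularized Text2Text distractor loss.

    gen_logits:        (batch, seq_len, vocab) decoder logits for the distractor sequence
    target_ids:        (batch, seq_len) gold distractor token ids
    answer_logits:     (batch, vocab) model distribution at the blank given the correct answer
    distractor_logits: (batch, vocab) model distribution at the blank given the distractor
    """
    # Standard sequence-to-sequence cross-entropy over the distractor tokens.
    ce = F.cross_entropy(gen_logits.view(-1, gen_logits.size(-1)), target_ids.view(-1))

    # KL-style term comparing the two blank-filling distributions; a weighted
    # version of this is added to the main loss as a regularizer.
    p = F.log_softmax(answer_logits, dim=-1)
    q = F.log_softmax(distractor_logits, dim=-1)
    kl = F.kl_div(q, p, log_target=True, reduction="batchmean")

    return ce + beta * kl
```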

2022

CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
Shang-Hsuan Chiang | Ssu-Cheng Wang | Yao-Chung Fan
Findings of the Association for Computational Linguistics: EMNLP 2022

Manually designing cloze tests consumes enormous time and effort, and the major challenge lies in selecting the wrong options (distractors). Carefully designed distractors improve the effectiveness of learner ability assessment, which motivates automatic cloze distractor generation. In this paper, we investigate cloze distractor generation by exploring the use of pre-trained language models (PLMs) for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement: our best-performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset are available at https://github.com/AndyChiangSH/CDGP.
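As a rough illustration of the candidate-generation step only (not the full CDGP pipeline, which also ranks and selects candidates), the sketch below uses an off-the-shelf masked language model to propose distractor candidates for a cloze blank. The model choice and the helper name are assumptions for illustration.

```python
from transformers import pipeline

# Masked-LM pipeline used as a candidate distractor generator.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def propose_distractors(cloze_sentence, answer, top_k=10, keep=3):
    """cloze_sentence must contain the [MASK] token at the blank position."""
    predictions = fill_mask(cloze_sentence, top_k=top_k)
    candidates = [p["token_str"].strip() for p in predictions]
    # Drop the gold answer so the remaining candidates act as distractors.
    candidates = [c for c in candidates if c.lower() != answer.lower()]
    return candidates[:keep]

# Example usage
print(propose_distractors("The capital of France is [MASK].", answer="Paris"))
```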

Keyword Provision Question Generation for Facilitating Educational Reading Comprehension Preparation
Ying-Hong Chan | Ho-Lam Chung | Yao-Chung Fan
Proceedings of the 15th International Conference on Natural Language Generation

2020

A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies.
Ho-Lam Chung | Ying-Hong Chan | Yao-Chung Fan
Findings of the Association for Computational Linguistics: EMNLP 2020

In this paper, we address two limitations of existing distractor generation (DG) methods. First, the quality of existing DG methods is still far from practical use, leaving considerable room for improvement. Second, existing DG designs mainly target single-distractor generation, whereas practical MCQ preparation requires multiple distractors. To address these issues, we present a new distractor generation scheme with multi-tasking and negative answer training strategies for effectively generating multiple distractors. The experimental results show that (1) our model advances the state-of-the-art result from 28.65 to 39.81 (BLEU-1 score) and (2) the generated distractors are diverse and show strong distracting power for multiple-choice questions.
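The exact multi-tasking and negative answer training recipe is not given in the abstract; the following is a hedged sketch of what a combined objective of this kind might look like, with a main distractor-generation loss, an optional auxiliary cloze-related loss, and a penalty discouraging the model from echoing the correct answer. All names, tensor shapes, and weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def multitask_dg_loss(dg_logits, distractor_ids, answer_ids,
                      aux_logits=None, aux_labels=None,
                      neg_weight=0.5, aux_weight=0.5):
    """Illustrative combined objective (not the paper's exact recipe).

    dg_logits:      (batch, seq_len, vocab) generator logits
    distractor_ids: (batch, seq_len) gold distractor token ids
    answer_ids:     (batch, seq_len) correct-answer token ids aligned to decoder positions
    """
    vocab = dg_logits.size(-1)

    # Main task: generate the gold distractor.
    dg_loss = F.cross_entropy(dg_logits.view(-1, vocab), distractor_ids.view(-1))

    # Negative answer training: minimizing the log-probability assigned to the
    # correct answer tokens pushes the generator away from simply echoing it.
    log_probs = F.log_softmax(dg_logits, dim=-1)
    answer_logp = log_probs.gather(-1, answer_ids.unsqueeze(-1)).squeeze(-1)
    loss = dg_loss + neg_weight * answer_logp.mean()

    # Optional auxiliary cloze-related task trained jointly.
    if aux_logits is not None:
        loss = loss + aux_weight * F.cross_entropy(
            aux_logits.view(-1, aux_logits.size(-1)), aux_labels.view(-1))
    return loss
```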

2019

BERT for Question Generation
Ying-Hong Chan | Yao-Chung Fan
Proceedings of the 12th International Conference on Natural Language Generation

In this study, we investigate employing the pre-trained BERT language model for question generation. We introduce two neural architectures built on top of BERT. The first is a straightforward application of BERT, which reveals the defects of using BERT directly for text generation. The second remedies this by restructuring the model to decode sequentially, conditioning each step on previously decoded results. Our models are trained and evaluated on the question-answering dataset SQuAD. Experimental results show that our best model yields state-of-the-art performance, advancing the BLEU-4 score of the best existing models from 16.85 to 18.91.

A Recurrent BERT-based Model for Question Generation
Ying-Hong Chan | Yao-Chung Fan
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

In this study, we investigate employing the pre-trained BERT language model for question generation. We introduce three neural architectures built on top of BERT. The first is a straightforward application of BERT, which reveals the defects of using BERT directly for text generation. Accordingly, we propose two further models that restructure the BERT decoder into a sequential form, conditioning each step on previously decoded results. Our models are trained and evaluated on the question-answering dataset SQuAD. Experimental results show that our best model yields state-of-the-art performance, advancing the BLEU-4 score of the best existing models from 16.85 to 22.17.
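As a rough sketch of the sequential mask-and-predict idea behind decoding with a masked language model (not the paper's exact recurrent architecture, and using an off-the-shelf checkpoint rather than the fine-tuned question generator), one can decode token by token as follows; without the paper's training, the output will not be a well-formed question.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def sequential_decode(context, max_new_tokens=20):
    """Greedy token-by-token decoding: append [MASK], predict it, feed it back."""
    decoded = []
    for _ in range(max_new_tokens):
        text = context + " " + tokenizer.convert_tokens_to_string(decoded) + " [MASK]"
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Position of the [MASK] token we just appended.
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
        next_id = logits[0, mask_pos].argmax().item()
        token = tokenizer.convert_ids_to_tokens(next_id)
        decoded.append(token)
        if token in ("[SEP]", "."):
            break
    return tokenizer.convert_tokens_to_string(decoded)
```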