Hsin-Yi Hsieh


2025

pdf bib
TaiwanVQA: A Benchmark for Visual Question Answering for Taiwanese Daily Life
Hsin-Yi Hsieh | Shang Wei Liu | Chang Chih Meng | Shuo-Yueh Lin | Chen Chien-Hua | Hung-Ju Lin | Hen-Hsen Huang | I-Chen Wu
Proceedings of the First Workshop of Evaluation of Multi-Modal Generation

We introduce TaiwanVQA, a novel visual question answering benchmark designed to evaluate vision language models’ (VLMs) ability to recognize and reason about Taiwan-specific multimodal content.TaiwanVQA comprises 2,000 image-question pairs covering diverse topics relevant to Taiwanese culture and daily life. We categorize the questions into recognition and reasoning tasks, further sub-classifying reasoning questions based on the level of external knowledge required. We conduct extensive experiments on state-of-the-art VLMs, including GPT-4o, Llama-3.2, LLaVA, Qwen2-VL, and InternVL2 models. Our findings reveal significant limitations in current VLMs when handling culturally specific content. The performance gap widens between recognition tasks (top score 73.60%) and reasoning tasks (top score 49.80%), indicating challenges in cultural inference and contextual understanding.These results highlight the need for more culturally diverse training data and improved model architectures that can better integrate visual and textual information within specific cultural contexts. By providing TaiwanVQA, we aim to contribute to the development of more inclusive and culturally aware AI models, facilitating their deployment in diverse real-world settings. TaiwanVQA can be accessed on our GitHub page.

2024

pdf bib
TWBias: A Benchmark for Assessing Social Bias in Traditional Chinese Large Language Models through a Taiwan Cultural Lens
Hsin-Yi Hsieh | Shih-Cheng Huang | Richard Tzong-Han Tsai
Findings of the Association for Computational Linguistics: EMNLP 2024

2023

pdf bib
MingOfficial: A Ming Official Career Dataset and a Historical Context-Aware Representation Learning Framework
You-Jun Chen | Hsin-Yi Hsieh | Yu Lin | Yingtao Tian | Bert Chan | Yu-Sin Liu | Yi-Hsuan Lin | Richard Tsai
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In Chinese studies, understanding the nuanced traits of historical figures, often not explicitly evident in biographical data, has been a key interest. However, identifying these traits can be challenging due to the need for domain expertise, specialist knowledge, and context-specific insights, making the process time-consuming and difficult to scale. Our focus on studying officials from China’s Ming Dynasty is no exception. To tackle this challenge, we propose MingOfficial, a large-scale multi-modal dataset consisting of both structured (career records, annotated personnel types) and text (historical texts) data for 9,376 officials. We further couple the dataset with a a graph neural network (GNN) to combine both modalities in order to allow investigation of social structures and provide features to boost down-stream tasks. Experiments show that our proposed MingOfficial could enable exploratory analysis of official identities, and also significantly boost performance in tasks such as identifying nuance identities (e.g. civil officials holding military power) from 24.6% to 98.2% F1 score in hold-out test set. By making MingOfficial publicly available (see main text for the URL) as both a dataset and an interactive tool, we aim to stimulate further research into the role of social context and representation learning in identifying individual characteristics, and hope to provide inspiration for computational approaches in other fields beyond Chinese studies.