Xin Tan

2025

A Benchmark for Translations Across Styles and Language Variants
Xin Tan | Bowei Zou | AiTi Aw
Findings of the Association for Computational Linguistics: EMNLP 2025

As machine translation (MT) rapidly advances in bridging global communication gaps, there is growing interest in variety-targeted translation for fine-grained language variants and specific translation styles. This kind of translation aims to generate target outputs that are not only contextually accurate but also culturally sensitive. However, the lack of comprehensive evaluation benchmarks has hindered progress in this field. To bridge this gap, this work focuses on translation across styles and language variants, aiming to establish a robust foundation for the automatic evaluation of fine-grained cultural and stylistic nuances, thereby fostering innovation in culturally sensitive translation. Specifically, we evaluate translations across four key dimensions: semantic preservation, cultural and regional specificity, expression style, and fluency at both the word and sentence levels. Through detailed human evaluations, we validate the high reliability of the proposed evaluation framework. On this basis, we thoroughly assess translations of state-of-the-art large language models (LLMs) for this task, highlighting their strengths and identifying areas for future improvement.

Improving Explainable Fact-Checking with Claim-Evidence Correlations
Xin Tan | Bowei Zou | Ai Ti Aw
Proceedings of the 31st International Conference on Computational Linguistics

Automatic fact-checking systems that employ large language models (LLMs) have achieved human-level performance in combating widespread misinformation. However, current LLM-based fact-checking systems fail to reveal the reasoning principles behind their decision-making for the claim verdict. In this work, we propose Correlation-Enhanced Explainable Fact-Checking (CorXFact), an LLM-based fact-checking system that simulates the reasoning principle of human fact-checkers for evidence-based claim verification: assessing and weighing the correlations between the claim and each piece of evidence. Following this principle, CorXFact enables efficient claim verification and transparent explanation generation. Furthermore, we contribute the CorFEVER test set to comprehensively evaluate the CorXFact system in claim-evidence correlation identification and claim verification in both closed-domain and real-world fact-checking scenarios. Experimental results show that our proposed CorXFact significantly outperforms four strong fact-checking baselines in claim authenticity prediction and verdict explanation.

2024

The rapid advancement of Large Language Models (LLMs) has brought about remarkable generative capabilities but also raised concerns about their potential misuse. While strategies like supervised fine-tuning and reinforcement learning from human feedback have enhanced their safety, these methods primarily focus on natural languages and may not generalize to other domains. This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs, presenting a novel environment for testing the safety generalization of LLMs. Our comprehensive studies on state-of-the-art LLMs including GPT-4, Claude-2, and Llama-2 series reveal a new and universal safety vulnerability of these models against code input: CodeAttack bypasses the safety guardrails of all models more than 80% of the time. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization, for example when natural language input is encoded with data structures. Furthermore, we hypothesize that CodeAttack succeeds because of a misaligned bias acquired by LLMs during code training, which prioritizes code completion over avoiding potential safety risks. Finally, we analyze potential mitigation measures. These findings highlight new safety risks in the code domain and the need for more robust safety alignment algorithms to match the code capabilities of LLMs.

2021

Coupling Context Modeling with Zero Pronoun Recovering for Document-Level Natural Language Generation
Xin Tan | Longyin Zhang | Guodong Zhou
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural language generation (NLG) tasks on pro-drop languages are known to suffer from zero pronoun (ZP) problems, and these problems remain challenging due to the scarcity of ZP-annotated NLG corpora. To address this, we propose a highly adaptive two-stage approach that couples context modeling with ZP recovery to mitigate the ZP problem in NLG tasks. Notably, we frame the recovery process in a task-supervised fashion, where the capability to recover ZP representations is learned during the NLG task learning process; thus, our method does not require NLG corpora annotated with ZPs. For system enhancement, we learn an adversarial bot to adjust our model outputs to alleviate the error propagation caused by mis-recovered ZPs. Experiments on three document-level NLG tasks, i.e., machine translation, question answering, and summarization, show that our approach improves performance to a great extent, with especially notable improvement on pronoun translation.

EDTC: A Corpus for Discourse-Level Topic Chain Parsing
Longyin Zhang | Xin Tan | Fang Kong | Guodong Zhou
Findings of the Association for Computational Linguistics: EMNLP 2021

Discourse analysis has long been known to be fundamental in natural language processing. In this research, we present our insight on discourse-level topic chain (DTC) parsing which aims at discovering new topics and investigating how these topics evolve over time within an article. To address the lack of data, we contribute a new discourse corpus with DTC-style dependency graphs annotated upon news articles. In particular, we ensure the high reliability of the corpus by utilizing a two-step annotation strategy to build the data and filtering out the annotations with low confidence scores. Based on the annotated corpus, we introduce a simple yet robust system for automatic discourse-level topic chain parsing.

2019

Hierarchical Modeling of Global Context for Document-Level Neural Machine Translation
Xin Tan | Longyin Zhang | Deyi Xiong | Guodong Zhou
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Document-level machine translation (MT) remains challenging due to the difficulty of efficiently using document context for translation. In this paper, we propose a hierarchical model to learn the global context for document-level neural machine translation (NMT). This is done through a sentence encoder to capture intra-sentence dependencies and a document encoder to model document-level inter-sentence consistency and coherence. With this hierarchical architecture, we feed back the extracted global document context to each word in a top-down fashion to distinguish different translations of a word according to its specific surrounding context. In addition, since large-scale in-domain document-level parallel corpora are usually unavailable, we use a two-step training strategy to take advantage of a large-scale corpus with out-of-domain parallel sentence pairs and a small-scale corpus with in-domain parallel document pairs to achieve domain adaptability. Experimental results on several benchmark corpora show that our proposed model can significantly improve document-level translation performance over several strong NMT baselines.
