Zuoyu Tian


2021

pdf bib
Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference
Hai Hu | He Zhou | Zuoyu Tian | Yiwen Zhang | Yina Patterson | Yanting Li | Yixin Nie | Kyle Richardson
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
BAHP: Benchmark of Assessing Word Embeddings in Historical Portuguese
Zuoyu Tian | Dylan Jarrett | Juan Escalona Torres | Patricia Amaral
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

High quality distributional models can capture lexical and semantic relations between words. Hence, researchers design various intrinsic tasks to test whether such relations are captured. However, most of the intrinsic tasks are designed for modern languages, and there is a lack of evaluation methods for distributional models of historical corpora. In this paper, we conducted BAHP: a benchmark of assessing word embeddings in Historical Portuguese, which contains four types of tests: analogy, similarity, outlier detection, and coherence. We examined word2vec models generated from two historical Portuguese corpora in these four test sets. The results demonstrate that our test sets are capable of measuring the quality of vector space models and can provide a holistic view of the model’s ability to capture syntactic and semantic information. Furthermore, the methodology for the creation of our test sets can be easily extended to other historical languages.

pdf bib
Period Classification in Chinese Historical Texts
Zuoyu Tian | Sandra Kübler
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

In this study, we study language change in Chinese Biji by using a classification task: classifying Ancient Chinese texts by time periods. Specifically, we focus on a unique genre in classical Chinese literature: Biji (literally “notebook” or “brush notes”), i.e., collections of anecdotes, quotations, etc., anything authors consider noteworthy, Biji span hundreds of years across many dynasties and conserve informal language in written form. For these reasons, they are regarded as a good resource for investigating language change in Chinese (Fang, 2010). In this paper, we create a new dataset of 108 Biji across four dynasties. Based on the dataset, we first introduce a time period classification task for Chinese. Then we investigate different feature representation methods for classification. The results show that models using contextualized embeddings perform best. An analysis of the top features chosen by the word n-gram model (after bleaching proper nouns) confirms that these features are informative and correspond to observations and assumptions made by historical linguists.

2020

pdf bib
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu | Hai Hu | Xuanwei Zhang | Lu Li | Chenjie Cao | Yudong Li | Yechen Xu | Kai Sun | Dian Yu | Cong Yu | Yin Tian | Qianqian Dong | Weitang Liu | Bo Shi | Yiming Cui | Junyi Li | Jun Zeng | Rongzhao Wang | Weijian Xie | Yanting Li | Yina Patterson | Zuoyu Tian | Yiwen Zhang | He Zhou | Shaoweihua Liu | Zhe Zhao | Qipeng Zhao | Cong Yue | Xinrui Zhang | Zhengliang Yang | Kyle Richardson | Zhenzhong Lan
Proceedings of the 28th International Conference on Computational Linguistics

The advent of natural language understanding (NLU) benchmarks for English, such as GLUE and SuperGLUE allows new NLU models to be evaluated across a diverse set of tasks. These comprehensive benchmarks have facilitated a broad range of research and applications in natural language processing (NLP). The problem, however, is that most such benchmarks are limited to English, which has made it difficult to replicate many of the successes in English NLU for other languages. To help remedy this issue, we introduce the first large-scale Chinese Language Understanding Evaluation (CLUE) benchmark. CLUE is an open-ended, community-driven project that brings together 9 tasks spanning several well-established single-sentence/sentence-pair classification tasks, as well as machine reading comprehension, all on original Chinese text. To establish results on these tasks, we report scores using an exhaustive set of current state-of-the-art pre-trained Chinese models (9 in total). We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on Chinese NLU. Our benchmark is released at https://www.cluebenchmarks.com

pdf bib
Offensive Language Detection Using Brown Clustering
Zuoyu Tian | Sandra Kübler
Proceedings of the 12th Language Resources and Evaluation Conference

In this study, we investigate the use of Brown clustering for offensive language detection. Brown clustering has been shown to be of little use when the task involves distinguishing word polarity in sentiment analysis tasks. In contrast to previous work, we train Brown clusters separately on positive and negative sentiment data, but then combine the information into a single complex feature per word. This way of representing words results in stable improvements in offensive language detection, when used as the only features or in combination with words or character n-grams. Brown clusters add important information, even when combined with words or character n-grams or with standard word embeddings in a convolutional neural network. However, we also found different trends between the two offensive language data sets we used.

pdf bib
Building a Treebank for Chinese Literature for Translation Studies
Hai Hu | Yanting Li | Yina Patterson | Zuoyu Tian | Yiwen Zhang | He Zhou | Sandra Kuebler | Chien-Jer Charles Lin
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories

2019

pdf bib
UM-IU@LING at SemEval-2019 Task 6: Identifying Offensive Tweets Using BERT and SVMs
Jian Zhu | Zuoyu Tian | Sandra Kübler
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the UM-IU@LING’s system for the SemEval 2019 Task 6: Offens-Eval. We take a mixed approach to identify and categorize hate speech in social media. In subtask A, we fine-tuned a BERT based classifier to detect abusive content in tweets, achieving a macro F1 score of 0.8136 on the test data, thus reaching the 3rd rank out of 103 submissions. In subtasks B and C, we used a linear SVM with selected character n-gram features. For subtask C, our system could identify the target of abuse with a macro F1 score of 0.5243, ranking it 27th out of 65 submissions.

pdf bib
Ensemble Methods to Distinguish Mainland and Taiwan Chinese
Hai Hu | Wen Li | He Zhou | Zuoyu Tian | Yiwen Zhang | Liang Zou
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects

This paper describes the IUCL system at VarDial 2019 evaluation campaign for the task of discriminating between Mainland and Taiwan variation of mandarin Chinese. We first build several base classifiers, including a Naive Bayes classifier with word n-gram as features, SVMs with both character and syntactic features, and neural networks with pre-trained character/word embeddings. Then we adopt ensemble methods to combine output from base classifiers to make final predictions. Our ensemble models achieve the highest F1 score (0.893) in simplified Chinese track and the second highest (0.901) in traditional Chinese track. Our results demonstrate the effectiveness and robustness of the ensemble methods.