Ying Liu


2024

Do PLMs and Annotators Share the Same Gender Bias? Definition, Dataset, and Framework of Contextualized Gender Bias
Shucheng Zhu | Bingjie Du | Jishun Zhao | Ying Liu | Pengyuan Liu
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Pre-trained language models (PLMs) have achieved success in a variety of natural language processing (NLP) tasks. However, PLMs also introduce disquieting safety problems, such as gender bias. Gender bias is an extremely complex issue, because different individuals may hold disparate opinions on whether the same sentence expresses harmful bias, especially for sentences that seem neutral or positive. This paper first defines the concept of contextualized gender bias (CGB), which makes it possible to measure implicit gender bias in both PLMs and annotators. We then construct CGBDataset, which contains 20k natural sentences with gendered words, drawn from Chinese news. Following the masked language modeling task, gendered words are masked, and PLMs and annotators judge whether a male word or a female word is more suitable. We then introduce CGBFrame to measure the gender bias of annotators. Comparing the results measured for PLMs and annotators, we find that although there are differences in the choices they make, the two show significant consistency in general.
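
The masked-word judgment described above is straightforward to reproduce in spirit. Below is a minimal sketch, assuming a BERT-style Chinese PLM from the Hugging Face hub; the example sentence and word pair are illustrative and not drawn from CGBDataset:

```python
# Minimal sketch: compare a masked LM's probabilities for a male vs. female
# word at a [MASK] position. Model, sentence, and word pair are assumptions.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")
model.eval()

def gendered_word_preference(sentence_with_mask, male_word, female_word):
    """Return the model's probabilities for each gendered word at [MASK]."""
    inputs = tokenizer(sentence_with_mask, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    probs = logits.softmax(dim=-1)
    male_id = tokenizer.convert_tokens_to_ids(male_word)
    female_id = tokenizer.convert_tokens_to_ids(female_word)
    return {male_word: probs[male_id].item(), female_word: probs[female_id].item()}

# Single-character gendered words keep this a one-token prediction.
print(gendered_word_preference("[MASK]喜欢照顾孩子。", "他", "她"))
```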

Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics
Zhu Liu | Cunliang Kong | Ying Liu | Maosong Sun
Findings of the Association for Computational Linguistics: ACL 2024

Large language models have achieved remarkable success in general language understanding tasks. However, as a family of generative methods trained with a next-token-prediction objective, the semantic evolution across the depth of these models has not been fully explored, unlike in their predecessors, such as BERT-like architectures. In this paper, we investigate the bottom-up evolution of lexical semantics in a popular LLM, Llama2, by probing its hidden states at the end of each layer using a contextualized word identification task. Our experiments show that the representations in lower layers encode lexical semantics, while the higher layers, with weaker semantic induction, are responsible for prediction. This contrasts with models trained with discriminative objectives, such as masked language modeling, where the higher layers obtain better lexical semantics. The conclusion is further supported by the monotonic increase in performance when probing the hidden states of trailing meaningless symbols, such as punctuation, in the prompting strategy. Our code is available at https://github.com/RyanLiut/LLM_LexSem.
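
The layer-wise probing setup can be illustrated with a short sketch. This assumes access to a Llama-2 checkpoint on the Hugging Face hub (a gated repository) and omits the contextualized word identification probe itself, showing only how per-layer hidden states are collected:

```python
# Sketch of per-layer hidden-state extraction; the checkpoint name is an
# assumption, and the probe trained on these states is omitted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; requires access approval
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, output_hidden_states=True)
model.eval()

inputs = tokenizer("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.hidden_states is a tuple: the embedding layer plus one tensor per layer,
# each of shape (batch, seq_len, hidden_dim). A probe would be trained on the
# target word's representation at each depth.
for depth, h in enumerate(out.hidden_states):
    print(depth, h.shape)
```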

Evaluating Moral Beliefs across LLMs through a Pluralistic Framework
Xuelin Liu | Yanfei Zhu | Shucheng Zhu | Pengyuan Liu | Ying Liu | Dong Yu
Findings of the Association for Computational Linguistics: EMNLP 2024

Proper moral beliefs are fundamental for language models, yet assessing these beliefs poses a significant challenge. This study introduces a novel three-module framework to evaluate the moral beliefs of four prominent large language models. First, we constructed a dataset containing 472 moral choice scenarios in Chinese, derived from moral words. The decision-making process of the models in these scenarios reveals their moral principle preferences. By ranking these moral choices, we discern the varying moral beliefs held by different language models. Additionally, through moral debates, we investigate how firmly the models adhere to their moral choices. Our findings indicate that the English language models, ChatGPT and Gemini, closely mirror the moral decisions of a sample of Chinese university students, demonstrating strong adherence to their choices and a preference for individualistic moral beliefs. In contrast, Chinese models such as Ernie and ChatGLM lean towards collectivist moral beliefs and exhibit ambiguity in their moral choices and debates. This study also uncovers gender bias embedded in the moral beliefs of all examined language models. Our methodology offers an innovative means of assessing moral beliefs in both artificial and human intelligence, facilitating a comparison of moral values across different cultures.

Clear Up Confusion: Advancing Cross-Domain Few-Shot Relation Extraction through Relation-Aware Prompt Learning
Ge Bai | Chenji Lu | Daichi Guo | Shilong Li | Ying Liu | Zhang Zhang | Guanting Dong | Ruifang Liu | Sun Yong
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Cross-domain few-shot Relation Extraction (RE) aims to transfer knowledge from a source domain to a different target domain to address low-resource problems. Previous work utilized label descriptions and entity information to leverage the knowledge of the source domain. However, these models are prone to confusion when directly applying this knowledge to a target domain with entirely new types of relations, which becomes particularly pronounced when facing similar relations. In this work, we propose a relation-aware prompt learning method with pre-training. Specifically, we empower the model to clear up confusion by decomposing various relation types through an innovative label prompt, while a context prompt is employed to capture differences across scenarios, enabling the model to further discern confusion. Two pre-training tasks are designed to leverage the prompt knowledge and paradigm. Experiments show that our method outperforms previous state-of-the-art (SOTA) methods, yielding significantly better results on cross-domain few-shot RE tasks.

Fusion Makes Perfection: An Efficient Multi-Grained Matching Approach for Zero-Shot Relation Extraction
Shilong Li | Ge Bai | Zhang Zhang | Ying Liu | Chenji Lu | Daichi Guo | Ruifang Liu | Sun Yong
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Predicting unseen relations that cannot be observed during the training phase is a challenging task in relation extraction. Previous works have made progress by matching the semantics of input instances and label descriptions. However, fine-grained matching often requires laborious manual annotation, and rich interactions between instances and label descriptions come with significant computational overhead. In this work, we propose an efficient multi-grained matching approach that uses virtual entity matching to reduce manual annotation cost and fuses coarse-grained recall with fine-grained classification to achieve rich interactions at guaranteed inference speed. Experimental results show that our approach outperforms previous state-of-the-art (SOTA) methods and achieves a balance between inference efficiency and prediction accuracy on zero-shot relation extraction tasks. Our code is available at https://github.com/longls777/EMMA.
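
The coarse-to-fine fusion can be pictured as a two-stage pipeline: cheap dot-product recall over precomputed label-description embeddings, followed by a finer scorer over only the recalled candidates. The sketch below uses placeholder encoders and a placeholder fine scorer, not EMMA's actual components:

```python
# Sketch of coarse recall followed by fine re-ranking; all vectors here are
# random stand-ins for real instance/label-description embeddings.
import torch

def coarse_recall(instance_vec, label_vecs, k=3):
    """Dot-product recall over precomputed label-description embeddings."""
    scores = label_vecs @ instance_vec
    return scores.topk(k).indices.tolist()

def fine_score(instance_vec, label_vec):
    # Placeholder for a richer cross-interaction scorer (e.g. a cross-encoder).
    return torch.cosine_similarity(instance_vec, label_vec, dim=0).item()

instance = torch.randn(256)
labels = torch.randn(20, 256)  # e.g. 20 unseen relation descriptions
candidates = coarse_recall(instance, labels)
best = max(candidates, key=lambda i: fine_score(instance, labels[i]))
print("predicted relation index:", best)
```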

Approaches and Challenges for Resolving Different Representations of Fictional Characters for Chinese Novels
Li Song | Ying Liu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Due to the huge scale of literary works, automatic text analysis technologies are urgently needed for literary studies such as digital humanities. However, the domain-generality of existing NLP technologies limits their effectiveness in in-depth literary studies, and it is valuable to explore how to adapt NLP technologies to literary-specific tasks. Fictional characters are the most essential elements of a novel and thus crucial to understanding its content. A prerequisite for collecting a character's information is resolving the character's different representations, a specific instance of anaphora resolution, which is a classical, open-domain NLP task. We adapt a state-of-the-art anaphora resolution model, with some modifications, to resolve character representations in Chinese novels, and train a widely used fine-tuned BERT model for speaker extraction as assistance. We also analyze the challenges and potential solutions for character resolution in Chinese novels based on the resolution results for a specific Chinese novel.

Quite Good, but Not Enough: Nationality Bias in Large Language Models - a Case Study of ChatGPT
Shucheng Zhu | Weikang Wang | Ying Liu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

While nationality is a pivotal demographic element that can enhance the performance of language models, it has received far less scrutiny regarding inherent biases. This study investigates nationality bias in ChatGPT (GPT-3.5), a large language model (LLM) designed for text generation. The research covers 195 countries, 4 temperature settings, and 3 distinct prompt types, generating 4,680 discourses about nationality descriptions in Chinese and English. Automated metrics were used to analyze nationality bias, and expert annotators, alongside ChatGPT itself, evaluated the perceived bias. The results show that ChatGPT's generated discourses are predominantly positive, especially compared to those of its predecessor, GPT-2. However, when prompted with negative inclinations, it occasionally produces negative content. Although ChatGPT considers its generated text neutral, it shows consistent self-awareness of nationality bias when subjected to the same pair-wise comparison annotation framework used by human annotators. In conclusion, while ChatGPT's generated texts seem friendly and positive, they reflect inherent real-world nationality biases. This bias may vary across different language versions of ChatGPT, indicating diverse cultural perspectives. The study highlights the subtle and pervasive nature of biases within LLMs, emphasizing the need for further scrutiny.

2023

Adversarial Multi-task Learning for End-to-end Metaphor Detection
Shenglong Zhang | Ying Liu
Findings of the Association for Computational Linguistics: ACL 2023

Metaphor detection (MD) suffers from limited training data. In this paper, we start from a linguistic rule, the Metaphor Identification Procedure, and propose a novel multi-task learning framework that transfers knowledge from basic sense discrimination (BSD) to MD. BSD is constructed from word sense disambiguation (WSD), for which copious amounts of data exist. We leverage adversarial training to align the data distributions of MD and BSD in the same feature space, so that task-invariant representations can be learned. To capture fine-grained alignment patterns, we utilize the multi-mode structures of MD and BSD. Our method is fully end-to-end and mitigates the data scarcity problem in MD. Competitive results are reported on four public datasets. Our code and datasets are available.
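
A standard ingredient of this kind of adversarial alignment is a gradient reversal layer: a task discriminator is trained normally while its reversed gradients push the shared encoder toward task-invariant features. The sketch below is this generic formulation with hypothetical dimensions, not the paper's exact architecture:

```python
# Generic gradient-reversal sketch for adversarial feature alignment.
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Identity in the forward pass, negated (scaled) gradient backward.
        return -ctx.lamb * grad_output, None

class TaskDiscriminator(nn.Module):
    """Predicts whether a feature came from the MD or the BSD task."""
    def __init__(self, dim, lamb=1.0):
        super().__init__()
        self.lamb = lamb
        self.clf = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, features):
        reversed_feats = GradReverse.apply(features, self.lamb)
        return self.clf(reversed_feats)

disc = TaskDiscriminator(dim=768)               # hypothetical feature size
feats = torch.randn(4, 768, requires_grad=True)  # shared encoder outputs
task_logits = disc(feats)  # train with cross-entropy on task labels
```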

Ambiguity Meets Uncertainty: Investigating Uncertainty Estimation for Word Sense Disambiguation
Zhu Liu | Ying Liu
Findings of the Association for Computational Linguistics: ACL 2023

Word sense disambiguation (WSD), which aims to determine an appropriate sense for a target word given its context, is crucial for natural language understanding. Existing supervised methods treat WSD as a classification task and have achieved remarkable performance. However, they ignore uncertainty estimation (UE) in real-world settings, where data is always noisy and out of distribution. This paper extensively studies UE on a benchmark designed for WSD. Specifically, we first compare four uncertainty scores for a state-of-the-art WSD model and verify that the conventional predictive probabilities obtained at the end of the model are inadequate for quantifying uncertainty. We then examine the model's capability to capture data and model uncertainties with the selected UE score on well-designed test scenarios, and discover that the model reflects data uncertainty satisfactorily but underestimates model uncertainty. Furthermore, we explore numerous lexical properties that intrinsically affect data uncertainty and provide a detailed analysis of four critical aspects: syntactic category, morphology, sense granularity, and semantic relations.
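
For readers unfamiliar with UE scores, the sketch below shows three common ones computed from classifier logits (maximum softmax probability, predictive entropy, and Monte Carlo dropout variance); the four scores actually compared in the paper may differ:

```python
# Common uncertainty scores; the MC-dropout helper assumes a Hugging Face-style
# classifier whose output exposes .logits.
import torch
import torch.nn.functional as F

def max_softmax_uncertainty(logits):
    """1 - max probability: higher means more uncertain."""
    return 1.0 - F.softmax(logits, dim=-1).max(dim=-1).values

def predictive_entropy(logits):
    probs = F.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

def mc_dropout_variance(model, inputs, n_samples=10):
    """Model uncertainty via Monte Carlo dropout: keep dropout active at
    inference and measure disagreement across stochastic forward passes."""
    model.train()  # enables dropout layers
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(**inputs).logits, dim=-1) for _ in range(n_samples)]
        )
    model.eval()
    return probs.var(dim=0).mean(dim=-1)
```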

Always the Best Fit: Adaptive Domain Gap Filling from Causal Perspective for Few-Shot Relation Extraction
Ge Bai | Chenji Lu | Jiaxiang Geng | Shilong Li | Yidong Shi | Xiyan Liu | Ying Liu | Zhang Zhang | Ruifang Liu
Findings of the Association for Computational Linguistics: EMNLP 2023

Cross-domain Relation Extraction aims to transfer knowledge from a source domain to a different target domain to address low-resource challenges. However, the semantic gap caused by data bias between domains is a major obstacle, especially in few-shot scenarios. Previous work has mainly focused on transferring knowledge between domains through shared feature representations, without analyzing the impact of each factor that may produce data bias based on the characteristics of each domain. This work takes a causal perspective and proposes a new framework, CausalGF. By constructing a unified structural causal model, we estimate the causal effects of factors such as syntactic structure, label distribution, and entities on the outcome. CausalGF calculates the causal effects among these factors and adjusts them dynamically based on domain characteristics, enabling adaptive gap filling. Our experiments show that our approach better fills the domain gap, yielding significantly better results on the cross-domain few-shot relation extraction task.

Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation
Yanzhao Shi | Junzhong Ji | Xiaodan Zhang | Liangqiong Qu | Ying Liu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Automatic generation of brain CT reports can improve the efficiency and accuracy of diagnosing cranial diseases. However, current methods are limited by 1) coarse-grained supervision: training data in image-text format lacks detailed supervision for recognizing subtle abnormalities, and 2) coupled cross-modal alignment: visual-textual alignment may be inevitably coupled in a coarse-grained manner, resulting in tangled feature representations for report generation. In this paper, we propose a novel Pathological Graph-driven Cross-modal Alignment (PGCA) model for accurate and robust brain CT report generation. Our approach decouples the cross-modal alignment by constructing a Pathological Graph to learn fine-grained visual cues and align them with textual words. This graph comprises heterogeneous nodes representing essential pathological attributes (i.e., tissue and lesion), connected by intra- and inter-attribute edges encoding prior domain knowledge. Through carefully designed graph embedding and updating modules, our model refines the visual features of subtle tissues and lesions and aligns them with textual words using contrastive learning. Extensive experimental results confirm the viability of our method. We believe that our PGCA model holds the potential to greatly enhance the automatic generation of brain CT reports and ultimately contribute to improved cranial disease diagnosis.
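
The contrastive visual-textual alignment step can be illustrated with a generic InfoNCE-style objective that pulls matched visual-node and word embeddings together; this is a common formulation, not necessarily PGCA's exact loss:

```python
# Generic symmetric InfoNCE sketch over matched (visual, textual) pairs.
import torch
import torch.nn.functional as F

def info_nce(visual, textual, temperature=0.07):
    """visual, textual: (batch, dim) embeddings where row i of each is a pair."""
    v = F.normalize(visual, dim=-1)
    t = F.normalize(textual, dim=-1)
    logits = v @ t.T / temperature     # pairwise similarity matrix
    targets = torch.arange(v.size(0))  # i-th visual matches i-th textual
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

loss = info_nce(torch.randn(8, 512), torch.randn(8, 512))
```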

2022

Analysis of Gender Bias in Social Perception and Judgement Using Chinese Word Embeddings
Jiali Li | Shucheng Zhu | Ying Liu | Pengyuan Liu
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Gender is a construct shaped by social perception and judgment, and an important means of this construction is language. When natural language processing tools, such as word embeddings, associate gender with categories of social perception and judgment, they are likely to cause bias and harm to groups that do not conform to mainstream social perceptions and judgments. Using 12,251 Chinese word embeddings as an intermediary, this paper studies the relationship between social perception and judgment categories and gender. The results reveal that these grammatically gender-neutral Chinese word embeddings exhibit a certain gender bias, consistent with mainstream society's perception and judgment of gender: men are judged by their actions and perceived as bad, easily disgusted, bad-tempered, and rational, while women are judged by their appearance and perceived as perfect, either happy or sad, and emotional.
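
Association measurements of this kind typically compare a word's similarity to male versus female attribute words in embedding space. Below is a minimal sketch using gensim; the embedding file and word lists are illustrative assumptions, not the paper's materials:

```python
# Sketch of a gender-association score over static word embeddings.
import numpy as np
from gensim.models import KeyedVectors

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def gender_association(kv, word, male_words, female_words):
    """Positive score = closer to the male attribute set; negative = female."""
    v = kv[word]
    male_sim = np.mean([cosine(v, kv[m]) for m in male_words if m in kv])
    female_sim = np.mean([cosine(v, kv[f]) for f in female_words if f in kv])
    return male_sim - female_sim

kv = KeyedVectors.load_word2vec_format("zh_embeddings.txt")  # hypothetical file
print(gender_association(kv, "理性", ["他", "男人"], ["她", "女人"]))
```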

中文自然语言处理多任务中的职业性别偏见测量(Measurement of Occupational Gender Bias in Chinese Natural Language Processing Tasks)
Mengqing Guo (郭梦清) | Jiali Li (李加厉) | Jishun Zhao (赵继舜) | Shucheng Zhu (朱述承) | Ying Liu (刘颖) | Pengyuan Liu (刘鹏远)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

Although pessimists believe that gender equality in the workplace will never be possible, as attitudes shift, more and more people believe that the choice of occupation should be matched only to individual ability and should not be determined by a person's gender. Occupational gender bias has already been found across natural language processing tasks. However, existing studies typically target specific English tasks and lack a Chinese-oriented, multi-task measurement of occupational gender bias. Based on the Holland occupational model, this paper measures occupational gender bias in three common Chinese natural language processing tasks: word embeddings, coreference resolution, and text generation. We find that occupational gender bias across tasks shares certain commonalities while also exhibiting distinctive differences. Overall, the occupational gender bias in these tasks reflects real-life stereotypes about the occupations chosen by people of different genders. In addition, when designing bias measurement metrics for different tasks, linguistic factors such as register and word order also need to be considered.

Cross-modal Contrastive Attention Model for Medical Report Generation
Xiao Song | Xiaodan Zhang | Junzhong Ji | Ying Liu | Pengxu Wei
Proceedings of the 29th International Conference on Computational Linguistics

Automatic medical report generation has recently gained increasing interest as a way to help radiologists write reports more efficiently. However, this image-to-text task is challenging due to typical data biases: 1) normal physiological structures dominate the images, with only tiny abnormalities; and 2) normal descriptions accordingly dominate the reports. Existing methods have attempted to solve these problems, but they neglect useful information from similar historical cases. In this paper, we propose a novel Cross-modal Contrastive Attention (CMCA) model to capture both visual and semantic information from similar cases, with two main modules: a Visual Contrastive Attention Module that refines the unique abnormal regions relative to retrieved case images, and a Cross-modal Attention Module that matches positive semantic information from the case reports. Extensive experiments on two widely used benchmarks, IU X-Ray and MIMIC-CXR, demonstrate that the proposed model outperforms state-of-the-art methods on almost all metrics. Further analyses also validate that our model improves reports with more accurate abnormal findings and richer descriptions.

Metaphor Detection via Linguistics Enhanced Siamese Network
Shenglong Zhang | Ying Liu
Proceedings of the 29th International Conference on Computational Linguistics

In this paper, we present MisNet, a novel model for word-level metaphor detection. MisNet converts two linguistic rules, the Metaphor Identification Procedure (MIP) and Selectional Preference Violation (SPV), into semantic matching tasks. The MIP module computes the similarity between the contextual meaning and the basic meaning of a target word, while the SPV module perceives the incongruity between target words and their contexts. To better represent basic meanings, MisNet utilizes dictionary resources. Empirical results indicate that MisNet achieves competitive performance on several datasets.
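
The MIP idea, comparing a word's contextual meaning with its basic meaning, can be sketched as generic semantic matching, with an off-the-shelf encoder standing in for MisNet's actual siamese architecture; the model choice and dictionary gloss below are illustrative:

```python
# Sketch: embed the word in context and embed its basic (dictionary) sense,
# then compare. Low similarity hints at metaphorical use.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")
enc.eval()

def embed(text):
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        h = enc(**inputs).last_hidden_state
    return h.mean(dim=1).squeeze(0)  # mean-pooled sentence vector

context = "He attacked every weak point in my argument."
basic_sense = "attack: to use violence to try to hurt someone"  # assumed gloss

sim = torch.cosine_similarity(embed(context), embed(basic_sense), dim=0)
print(float(sim))
```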

2021

SaGE: 基于句法感知图卷积神经网络和ELECTRA的中文隐喻识别模型(SaGE: Syntax-aware GCN with ELECTRA for Chinese Metaphor Detection)
Shenglong Zhang (张声龙) | Ying Liu (刘颖) | Yanjun Ma (马艳军)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

Metaphor is a special phenomenon that frequently occurs in human language, and metaphor detection is of fundamental importance to a range of natural language processing tasks. For Chinese metaphor detection, we propose SaGE (Syntax-aware GCN with ELECTRA), a model grounded in linguistics. It uses ELECTRA and a Transformer encoder to extract the semantic features of a sentence, organizes the sentence into a graph according to its dependency relations and uses a graph convolutional network to extract its syntactic features, and then fuses the two kinds of features for metaphor detection. Our model surpassed the previous best result on the CCL 2018 Chinese metaphor detection shared-task dataset with a macro-averaged F1 score of 85.22%, verifying that fusing semantic and syntactic information plays an important role in metaphor detection.
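
The syntactic branch can be pictured as building an adjacency matrix from dependency heads and applying a GCN layer over token features; the toy parse, normalization, and dimensions below are illustrative, not SaGE's exact configuration:

```python
# Sketch: dependency heads -> adjacency matrix -> one simple GCN layer.
import torch
from torch import nn

def dependency_adjacency(heads):
    """heads[i] = index of token i's head (-1 for root); undirected + self-loops."""
    n = len(heads)
    adj = torch.eye(n)
    for i, h in enumerate(heads):
        if h >= 0:
            adj[i, h] = adj[h, i] = 1.0
    return adj

class GCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x, adj):
        deg = adj.sum(dim=-1, keepdim=True)  # simple degree normalization
        return torch.relu(self.linear(adj @ x / deg))

heads = [2, 2, -1, 2]        # toy parse: tokens 0, 1, 3 attach to token 2 (root)
feats = torch.randn(4, 256)  # e.g. token embeddings from a semantic encoder
syntax_feats = GCNLayer(256)(feats, dependency_adjacency(heads))
```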

Native Language Identification and Reconstruction of Native Language Relationship Using Japanese Learner Corpus
Mitsuhiro Nishijima | Ying Liu
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
Song Xu | Haoran Li | Peng Yuan | Yujia Wang | Youzheng Wu | Xiaodong He | Ying Liu | Bowen Zhou
Findings of the Association for Computational Linguistics: EMNLP 2021

Existing pre-trained language models (PLMs) have demonstrated the effectiveness of self-supervised learning for a broad range of natural language processing (NLP) tasks. However, most of them are not explicitly aware of domain-specific knowledge, which is essential for downstream tasks in many domains, such as e-commerce scenarios. In this paper, we propose K-PLUG, a knowledge-injected pre-trained language model based on the encoder-decoder transformer that can be transferred to both natural language understanding and generation tasks. Specifically, we propose five knowledge-aware self-supervised pre-training objectives that formulate the learning of domain-specific knowledge, covering e-commerce domain-specific knowledge bases, aspects of product entities, categories of product entities, and unique selling propositions of product entities. We verify our method in a diverse range of e-commerce scenarios that require domain-specific knowledge, including product knowledge base completion, abstractive product summarization, and multi-turn dialogue. K-PLUG significantly outperforms baselines across the board, demonstrating that the proposed method effectively learns a diverse set of domain-specific knowledge for both language understanding and generation tasks. Our code is available.

2020

用计量风格学方法考察《水浒传》的作者争议问题——以罗贯中《平妖传》为参照(Quantitive Stylistics Based Research on the Controversy of the Author of “Tales of the Marshes”: Comparing with “Pingyaozhuan” of Luo Guanzhong)
Li Song (宋丽) | Ying Liu (刘颖)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Whether “Tales of the Marshes” was written by a single author or co-authored, and what relationship Shi Nai'an and Luo Guanzhong had, have long been disputed. This paper roughly classifies the authorship controversy into five scenarios: written by Shi Nai'an; written by Luo Guanzhong; begun by Shi and continued by Luo; begun by Luo and continued by others; and written by Shi and revised by Luo. Taking Luo Guanzhong's “Pingyaozhuan” as a reference, we examine the writing style of “Tales of the Marshes” using hypothesis testing, text clustering, text classification, and stylometric fluctuation analysis, combined with an analysis of the texts' content, in an attempt to provide evidence for determining its authorship. The results show that only the scenario in which Luo began the novel and others continued it is plausible, i.e., the first 70 chapters were written by Luo Guanzhong and the remainder was continued by someone else; the other four scenarios are all unlikely.

Modularized Syntactic Neural Networks for Sentence Classification
Haiyan Wu | Ying Liu | Shaoyun Shi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper focuses on tree-based modeling for the sentence classification task. Existing works that aggregate over a syntax tree usually consider only local information of sub-trees. In contrast, our proposed Modularized Syntactic Neural Network (MSNN) additionally utilizes syntax category labels and takes advantage of global context while modeling sub-trees. In MSNN, each node of a syntax tree is modeled by a label-related syntax module. Each syntax module aggregates the outputs of lower-level modules, and finally the root module provides the sentence representation. We design a tree-parallel mini-batch strategy for efficient training and prediction. Experimental results on four benchmark datasets show that MSNN significantly outperforms previous state-of-the-art tree-based methods on the sentence classification task.

2019

Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision
Jianhao Yan | Lin He | Ruqin Huang | Jian Li | Ying Liu
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Distant supervision (DS) is an important paradigm for automatically extracting relations. It utilizes an existing knowledge base to collect examples of the relation we intend to extract and then uses these examples to automatically generate training data. However, the collected examples can be very noisy and pose a significant challenge to obtaining high-quality labels. Previous work has made remarkable progress in predicting relations from distant supervision but typically ignores the temporal relations among the supervising instances. This paper formulates the problem of relation extraction with temporal reasoning and proposes a solution that predicts whether two given entities participate in a relation at a given time spot. For this purpose, we construct a dataset called WIKI-TIME, which additionally includes the valid period of a given relation between two entities in the knowledge base. We propose a novel neural model that incorporates both temporal information encoding and sequential reasoning. The experimental results show that, compared with the best existing models, our model achieves better performance on both the WIKI-TIME dataset and the well-studied NYT-10 dataset.

2015

A Corpus-Based Study of zunshou and Its English Equivalents
Ying Liu
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

2014

A Corpus-Based Quantitative Study of Nominalizations across Chinese and British Media English
Ying Liu | Alex Chengyu Fang | Naixing Wei
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

2013

UMLS::Similarity: Measuring the Relatedness and Similarity of Biomedical Concepts
Bridget McInnes | Ted Pedersen | Serguei Pakhomov | Ying Liu | Genevieve Melton-Meaux
Proceedings of the 2013 NAACL HLT Demonstration Session

2011

Using Second-order Vectors in a Knowledge-based Method for Acronym Disambiguation
Bridget T. McInnes | Ted Pedersen | Ying Liu | Serguei V. Pakhomov | Genevieve B. Melton
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

The Ngram Statistics Package (Text::NSP) : A Flexible Tool for Identifying Ngrams, Collocations, and Word Associations
Ted Pedersen | Satanjeev Banerjee | Bridget McInnes | Saiyam Kohli | Mahesh Joshi | Ying Liu
Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World