Pengyuan Liu

Also published as: PengYuan Liu, Peng-Yuan Liu, 鹏远


What’s the most important value? INVP: INvestigating the Value Priorities of LLMs through Decision-making in Social Scenarios
Xuelin Liu | Pengyuan Liu | Dong Yu
Proceedings of the 31st International Conference on Computational Linguistics

As large language models (LLMs) demonstrate impressive performance in various tasks and are increasingly integrated into the decision-making process, ensuring they align with human values has become crucial. This paper highlights that value priorities (the relative importance of different values) play a pivotal role in the decision-making process. To explore the value priorities in LLMs, this paper introduces INVP, a framework for INvestigating Value Priorities through decision-making in social scenarios. The framework encompasses social scenarios including binary decision-making, covering both individual and collective decision-making contexts, and is based on Schwartz’s value theory for constructing value priorities. Using this framework, we construct a dataset, which contains a total of 1613 scenarios and 3226 decisions across 283 topics. We evaluate seven popular LLMs and the experimental results reveal commonalities in the value priorities across different LLMs, such as an emphasis on Universalism and Benevolence, while Power and Hedonism are typically given lower priority. This study provides fresh insights into understanding and enhancing the moral and value alignment of LLMs when making complex social decisions.


MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization
Zhiyu Yang | Zihan Zhou | Shuo Wang | Xin Cong | Xu Han | Yukun Yan | Zhenghao Liu | Zhixing Tan | Pengyuan Liu | Dong Yu | Zhiyuan Liu | Xiaodong Shi | Maosong Sun
Findings of the Association for Computational Linguistics: ACL 2024

Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and assisting researchers in identifying implicit patterns. Despite its importance, the use of Large Language Models (LLMs) for scientific data visualization remains rather unexplored. In this study, we introduce MatPlotAgent, an efficient model-agnostic LLM agent framework designed to automate scientific data visualization tasks. Leveraging the capabilities of both code LLMs and multi-modal LLMs, MatPlotAgent consists of three core modules: query understanding, code generation with iterative debugging, and a visual feedback mechanism for error correction. To address the lack of benchmarks in this field, we present MatPlotBench, a high-quality benchmark consisting of 100 human-verified test cases. Additionally, we introduce a scoring approach that utilizes GPT-4V for automatic evaluation. Experimental results demonstrate that MatPlotAgent can improve the performance of various LLMs, including both commercial and open-source models. Furthermore, the proposed evaluation method shows a strong correlation with human-annotated scores.

Evaluating Moral Beliefs across LLMs through a Pluralistic Framework
Xuelin Liu | Yanfei Zhu | Shucheng Zhu | Pengyuan Liu | Ying Liu | Dong Yu
Findings of the Association for Computational Linguistics: EMNLP 2024

Proper moral beliefs are fundamental for language models, yet assessing these beliefs poses a significant challenge. This study introduces a novel three-module framework to evaluate the moral beliefs of four prominent large language models. Initially, we constructed a dataset containing 472 moral choice scenarios in Chinese, derived from moral words. The decision-making process of the models in these scenarios reveals their moral principle preferences. By ranking these moral choices, we discern the varying moral beliefs held by different language models. Additionally, through moral debates, we investigate how firmly these models hold to their moral choices. Our findings indicate that English language models, namely ChatGPT and Gemini, closely mirror the moral decisions of a sample of Chinese university students, demonstrating strong adherence to their choices and a preference for individualistic moral beliefs. In contrast, Chinese models such as Ernie and ChatGLM lean towards collectivist moral beliefs, exhibiting ambiguity in their moral choices and debates. This study also uncovers gender bias embedded within the moral beliefs of all examined language models. Our methodology offers an innovative means to assess moral beliefs in both artificial and human intelligence, facilitating a comparison of moral values across different cultures.

Do PLMs and Annotators Share the Same Gender Bias? Definition, Dataset, and Framework of Contextualized Gender Bias
Shucheng Zhu | Bingjie Du | Jishun Zhao | Ying Liu | Pengyuan Liu
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Pre-trained language models (PLMs) have achieved success in a variety of natural language processing (NLP) tasks. However, PLMs also introduce some disquieting safety problems, such as gender bias. Gender bias is an extremely complex issue, because different individuals may hold disparate opinions on whether the same sentence expresses harmful bias, especially for sentences that seem neutral or positive. This paper first defines the concept of contextualized gender bias (CGB), which makes it easy to measure implicit gender bias in both PLMs and annotators. We then construct CGBDataset, which contains 20k natural sentences with gendered words drawn from Chinese news. Similar to the task of masked language models, gendered words are masked for PLMs and annotators to judge whether a male word or a female word is more suitable. Then, we introduce CGBFrame to measure the gender bias of annotators. By comparing the results measured by PLMs and annotators, we find that although there are differences in the choices made by PLMs and annotators, they show significant consistency in general.


中国社会道德变化模型与发展动因探究——基于70年《人民日报》的计量与分析 (The Model of Moral Change and Motivation in Chinese Society ——The Vocabulary Analysis of the 70-year ”People’s Daily”)
Hongrui Wang (王弘睿) | Dong Yu (于东) | Pengyuan Liu (刘鹏远) | Liying Ceng (曾立英)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics


动词视角下的汉语性别表征研究——基于多语体语料库与依存分析(Gendered Representation in Chinese via Verbal Analysis —Based on a Multi-register Corpus and Dependency Parsing)
Yingshi Chen (陈颖诗) | Dong Yu (于东) | Pengyuan Liu (刘鹏远)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics


大规模语言模型增强的中文篇章多维度阅读体验量化研究(Quantitative Research on Multi-dimensional Reading Experience of Chinese Texts Enhanced by Large Language Model)
Jiadai Sun (孙嘉黛) | Siyi Tang (汤思怡) | Shike Wang (王诗可) | Dong Yu (于东) | Pengyuan Liu (刘鹏远)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics



CoreValue:面向价值观计算的中文核心价值-行为体系及知识库(CoreValue: Chinese Core Value-Behavior Frame and Knowledge Base for Value Computing)
Pengyuan Liu (刘鹏远) | Sanle Zhang (张三乐) | Dong Yu (于东) | Lin Bo (薄琳)
Proceedings of the 21st Chinese National Conference on Computational Linguistics


中文自然语言处理多任务中的职业性别偏见测量(Measurement of Occupational Gender Bias in Chinese Natural Language Processing Tasks)
Mengqing Guo (郭梦清) | Jiali Li (李加厉) | Jishun Zhao (赵继舜) | Shucheng Zhu (朱述承) | Ying Liu (刘颖) | Pengyuan Liu (刘鹏远)
Proceedings of the 21st Chinese National Conference on Computational Linguistics


From Polarity to Intensity: Mining Morality from Semantic Space
Chunxu Zhao | Pengyuan Liu | Dong Yu
Proceedings of the 29th International Conference on Computational Linguistics

Most works on computational morality focus on moral polarity recognition, i.e., distinguishing right from wrong. However, a discrete polarity label is not informative enough to reflect morality as it does not contain any degree or intensity information. Existing approaches to compute moral intensity are limited to word-level measurement and heavily rely on human labelling. In this paper, we propose MoralScore, a weakly-supervised framework that can automatically measure moral intensity from text. It only needs moral polarity labels, which are more robust and easier to acquire. Besides, the framework can capture latent moral information not only from words but also from sentence-level semantics which can provide a more comprehensive measurement. To evaluate the performance of our method, we introduce a set of evaluation metrics and conduct extensive experiments. Results show that our method achieves good performance on both automatic and human evaluations.

Analysis of Gender Bias in Social Perception and Judgement Using Chinese Word Embeddings
Jiali Li | Shucheng Zhu | Ying Liu | Pengyuan Liu
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Gender is a construction in line with social perception and judgment. An important means of this construction is language. When natural language processing tools, such as word embeddings, associate gender with the relevant categories of social perception and judgment, they are likely to cause bias and harm to those groups that do not conform to the mainstream social perception and judgment. Using 12,251 Chinese word embeddings as an intermediary, this paper studies the relationship between social perception and judgment categories and gender. The results reveal that these grammatically gender-neutral Chinese word embeddings show a certain gender bias, which is consistent with mainstream society’s perception and judgment of gender. Men are judged by their actions and perceived as bad, easily-disgusted, bad-tempered and rational roles, while women are judged by their appearances and perceived as perfect, either happy or sad, and emotional roles.

CLGC: A Corpus for Chinese Literary Grace Evaluation
Yi Li | Dong Yu | Pengyuan Liu
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper, we construct a Chinese literary grace corpus, CLGC, with 10,000 texts and more than 1.85 million tokens. Multi-level annotations are provided for each text in our corpus, including literary grace level, sentence category, and figure-of-speech type. Based on the corpus, we examine the correlation between fine-grained features (semantic information, part-of-speech, figure-of-speech, etc.) and literary grace level. We also propose a new Literary Grace Evaluation (LGE) task, which aims at making a comprehensive assessment of the literary grace level according to the text. Finally, we build classification models with machine learning algorithms (such as SVM and TextCNN) to demonstrate the effectiveness of our features and corpus for LGE. Our preliminary classification experiments achieve a weighted average F1-score of 79.71%.


中文句子级性别无偏数据集构建及预训练语言模型的性别偏度评估(Construction of Chinese Sentence-Level Gender-Unbiased Data Set and Evaluation of Gender Bias in Pre-Training Language)
Jishun Zhao (赵继舜) | Bingjie Du (杜冰洁) | Shucheng Zhu (朱述承) | Pengyuan Liu (刘鹏远)
Proceedings of the 20th Chinese National Conference on Computational Linguistics


中文关系抽取的句级语言学特征探究(A Probe into the Sentence-level Linguistic Features of Chinese Relation Extraction)
Baixi Xing (邢百西) | Jishun Zhao (赵继舜) | Pengyuan Liu (刘鹏远)
Proceedings of the 20th Chinese National Conference on Computational Linguistics


A Comparative Study of Collocation Extraction Methods from the Perspectives of Vocabulary and Grammar: A Case Study in the Field of Journalism
Lulu Gu | Yue Pan | Pengyuan Liu
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

BLCUFIGHT at SemEval-2021 Task 10: Novel Unsupervised Frameworks For Source-Free Domain Adaptation
Weikang Wang | Yi Wu | Yixiang Liu | Pengyuan Liu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Domain adaptation assumes that samples from source and target domains are freely accessible during a training phase. However, such an assumption is rarely plausible in the real world and may cause data-privacy issues, especially when the label of the source domain can be a sensitive attribute as an identifier. SemEval-2021 Task 10 focuses on these issues. We participate in the task and propose novel frameworks based on a self-training method. In our systems, two different frameworks are designed to solve text classification and sequence labeling. These approaches prove effective, ranking third among all systems in subtask A and first among all systems in subtask B.


基于语料库的武侠与仙侠网络小说文体、词汇及主题对比分析(A Corpus-based Contrastive Analysis of Style, Vocabulary and Theme of Wuxia and Xianxia Internet Novels)
Sanle Zhang (张三乐) | Pengyuan Liu (刘鹏远) | Hu Zhang (张虎)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


基于计量的百年中国人名用字性别特征研究(A Quantified Research on Gender Characteristics of Chinese Names in A Century)
Bingjie Du (杜冰洁) | Pengyuan Liu (刘鹏远) | Yongsheng Tian (田永胜)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


伟大的男人和倔强的女人:基于语料库的形容词性别偏度历时研究(Great Males and Stubborn Females: A Diachronic Study of Corpus-Based Gendered Skewness in Chinese Adjectives)
Shucheng Zhu (朱述承) | Pengyuan Liu (刘鹏远)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


小样本关系分类研究综述(Few-Shot Relation Classification: A Survey)
Han Hu (胡晗) | Pengyuan Liu (刘鹏远)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

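Relation classification, as an important step in building structured knowledge, has attracted considerable attention in natural language processing. In many application domains (such as medicine and finance), however, collecting enough data to train a relation classification model is very difficult. In recent years, few-shot learning, which requires only a small number of training samples, has been emerging across many fields. This paper presents a systematic survey of recent few-shot relation classification models and methods. According to the metric method used, existing methods are divided into two categories: prototype-based and distribution-based. According to whether additional information is used, models are divided into pre-trained and non-pre-trained. Beyond few-shot learning under the conventional setting, this paper also reviews few-shot learning in cross-domain and low-resource scenarios, discusses the limitations of current few-shot relation classification methods, and analyzes the technical challenges facing cross-domain few-shot learning. Finally, it outlines future directions for few-shot relation classification.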

CDCPP:跨领域中文标点符号预测(CDCPP: Cross-Domain Chinese Punctuation Prediction)
Pengyuan Liu (刘鹏远) | Weikang Wang (王伟康) | Likun Qiu (邱立坤) | Bingjie Du (杜冰洁)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


多目标情感分类中文数据集构建及分析研究(Construction and Analysis of Chinese Multi-Target Sentiment Classification Dataset)
Pengyuan Liu (刘鹏远) | Yongsheng Tian (田永胜) | Chengyu Du (杜成玉) | Likun Qiu (邱立坤)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


Sensorimotor Enhanced Neural Network for Metaphor Detection
Mingyu Wan | Baixi Xing | Qi Su | Pengyuan Liu | Chu-Ren Huang
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

Imbalanced Chinese Multi-label Text Classification Based on Alternating Attention
Hongliang Bi | Han Hu | Pengyuan Liu
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation


A Corpus-based Comparatively Study on the Semantic Features and Syntactic patterns of Yòu/Hái in Mandarin Chinese
Yuncui Zhang | Pengyuan Liu
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters


Combining Constituent and Dependency Syntactic Views for Chinese Semantic Role Labeling
Shiqi Li | Qin Lu | Tiejun Zhao | Pengyuan Liu | Hanjing Li
Coling 2010: Posters

Head-modifier Relation based Non-lexical Reordering Model for Phrase-Based Translation
Shui Liu | Sheng Li | Tiejun Zhao | Min Zhang | Pengyuan Liu
Coling 2010: Posters

PKU_HIT: An Event Detection System Based on Instances Expansion and Rich Syntactic Features
Shiqi Li | Pengyuan Liu | Tiejun Zhao | Qin Lu | Hanjing Li
Proceedings of the 5th International Workshop on Semantic Evaluation

PengYuan@PKU: Extracting Infrequent Sense Instance with the Same N-Gram Pattern for the SemEval-2010 Task 15
Peng-Yuan Liu | Shi-Wen Yu | Shui Liu | Tie-Jun Zhao
Proceedings of the 5th International Workshop on Semantic Evaluation


HIT-WSD: Using Search Engine for Multilingual Chinese-English Lexical Sample Task
PengYuan Liu | TieJun Zhao | MuYun Yang
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)