2024
Dynamic Planning for LLM-based Graphical User Interface Automation
Shaoqing Zhang | Zhuosheng Zhang | Kehai Chen | Xinbei Ma | Muyun Yang | Tiejun Zhao | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
The advent of large language models (LLMs) has spurred considerable interest in autonomous LLM-based agents, particularly for intriguing applications within smartphone graphical user interfaces (GUIs). When presented with a task goal, these agents typically emulate human actions within a GUI environment until the task is completed. However, a key challenge lies in devising effective plans to guide action prediction in GUI tasks, even though planning has been widely recognized as effective for decomposing complex tasks into a series of steps. Specifically, given the dynamic nature of environmental GUIs following action execution, it is crucial to dynamically adapt plans based on environmental feedback and action history. We show that the widely used ReAct approach fails due to excessively long historical dialogues. To address this challenge, we propose a novel approach called Dynamic Planning of Thoughts (D-PoT) for LLM-based GUI agents. D-PoT dynamically adjusts planning based on environmental feedback and execution history. Experimental results reveal that the proposed D-PoT significantly surpasses the strong GPT-4V baseline by +12.7% (34.66% → 47.36%) in accuracy. The analysis highlights the generality of dynamic planning across different backbone LLMs, as well as its benefits in mitigating hallucinations and adapting to unseen tasks. Code is available at https://github.com/sqzhang-lazy/D-PoT.
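A minimal sketch of the dynamic-planning idea, assuming a generic llm(prompt) completion function and a GUI environment exposing observe() and step(); all identifiers here are hypothetical illustrations, not the D-PoT codebase:

```python
# Sketch of a dynamic-planning agent loop: the plan is regenerated from
# fresh environmental feedback at every step instead of replaying an
# ever-growing dialogue history (the ReAct failure mode noted above).

def run_agent(llm, env, goal, max_steps=20):
    history = []  # executed actions and their observed outcomes
    for _ in range(max_steps):
        screen = env.observe()  # textual description of the current GUI
        plan = llm(
            f"Goal: {goal}\nScreen: {screen}\n"
            f"Executed so far: {history}\n"
            "Update the remaining plan as a numbered list of steps."
        )
        action = llm(
            f"Goal: {goal}\nScreen: {screen}\nPlan: {plan}\n"
            "Output the single next GUI action."
        )
        feedback = env.step(action)
        history.append((action, feedback))
        if feedback == "TASK_COMPLETE":  # hypothetical terminal signal
            break
    return history
```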
Self-Evaluation of Large Language Model based on Glass-box Features
Hui Huang | Yingqi Qu | Jing Liu | Muyun Yang | Bing Xu | Tiejun Zhao | Wenpeng Lu
Findings of the Association for Computational Linguistics: EMNLP 2024
The proliferation of open-source Large Language Models (LLMs) underscores the pressing need for evaluation methods. Existing works primarily rely on external evaluators, focusing on training and prompting strategies. However, a crucial aspect, namely model-aware glass-box features, is overlooked. In this study, we explore the utility of glass-box features in the scenario of self-evaluation, namely applying an LLM to evaluate its own output. We investigate various glass-box feature groups and find that the softmax distribution serves as a reliable quality indicator for self-evaluation. Experimental results on public benchmarks validate the feasibility of LLM self-evaluation using glass-box features.
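As an illustration, a minimal sketch of extracting softmax-based glass-box features with Hugging Face transformers; the particular statistics (mean token log-probability and mean entropy) are assumptions for illustration, not necessarily the exact feature set used in the paper:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def glassbox_features(model_name, prompt, output):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    ids = tok(prompt + output, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits            # (1, seq_len, vocab_size)
    # Softmax distribution over each generated token, given its prefix.
    out_logits = logits[0, n_prompt - 1 : -1]
    out_ids = ids[0, n_prompt:]
    dist = torch.distributions.Categorical(logits=out_logits)
    return {
        "mean_logprob": dist.log_prob(out_ids).mean().item(),  # confidence
        "mean_entropy": dist.entropy().mean().item(),          # uncertainty
    }
```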
DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms
Andong Chen | Lianzhang Lou | Kehai Chen | Xuefeng Bai | Yang Xiang | Muyun Yang | Tiejun Zhao | Min Zhang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is to guide LLMs to generate translations with human-like feedback. However, existing self-reflection methods lack effective feedback information, limiting translation performance. To address this, we introduce DUAL-REFLECT, a framework that leverages the dual learning of translation tasks to provide effective feedback, thereby enhancing the model’s self-reflective abilities and improving translation performance. Applying this method across various translation tasks has proven its effectiveness in improving translation accuracy and eliminating ambiguities, especially in translation tasks with low-resource language pairs.
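A minimal sketch of a dual-learning feedback loop of this kind, assuming a generic llm(prompt) completion function; the prompts and stopping rule are illustrative, not the paper's exact design:

```python
# Sketch: translate, back-translate via the dual task, compare the
# back-translation to the source, and feed discrepancies back to the
# model as reflection signal.

def dual_reflect_translate(llm, src, src_lang="en", tgt_lang="de", rounds=2):
    draft = llm(f"Translate from {src_lang} to {tgt_lang}: {src}")
    for _ in range(rounds):
        back = llm(f"Translate from {tgt_lang} to {src_lang}: {draft}")
        feedback = llm(
            f"Source: {src}\nBack-translation: {back}\n"
            "List meaning differences between the two, or say NONE."
        )
        if feedback.strip() == "NONE":
            break  # dual feedback detects no semantic drift
        draft = llm(
            f"Source: {src}\nDraft translation: {draft}\n"
            f"Issues found via back-translation: {feedback}\n"
            f"Produce an improved {tgt_lang} translation."
        )
    return draft
```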
2023
Improving Translation Quality Estimation with Bias Mitigation
Hui Huang | Shuangzhi Wu | Kehai Chen | Hui Di | Muyun Yang | Tiejun Zhao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
State-of-the-art translation Quality Estimation (QE) models are proven to be biased. More specifically, they over-rely on monolingual features while ignoring bilingual semantic alignment. In this work, we propose a novel method to mitigate the bias of the QE model and improve estimation performance. Our method is based on contrastive learning between clean and noisy sentence pairs. We first introduce noise to the target side of the parallel sentence pair, forming negative samples. With the original parallel pairs as positive samples, the QE model is contrastively trained to distinguish the positive samples from the negative ones. This objective is trained jointly with regression-style quality estimation, preventing the QE model from overfitting to monolingual features. Experiments on WMT QE evaluation datasets demonstrate that our method improves estimation performance by a large margin while mitigating the bias.
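A minimal sketch of such a joint objective, with hypothetical encode and score_head components standing in for the QE model; the margin-based contrastive term is one plausible instantiation, not necessarily the paper's exact loss:

```python
import torch.nn.functional as F

def qe_loss(encode, score_head, src, tgt, noisy_tgts, gold_score, margin=1.0):
    """encode(src, tgt) -> pooled bilingual representation;
    score_head(rep) -> scalar quality score; gold_score: scalar tensor."""
    s_pos = score_head(encode(src, tgt)).squeeze()
    loss_reg = F.mse_loss(s_pos, gold_score)     # regression-style QE
    # Contrastive term: the clean pair must outscore every noise-injected
    # pair by a margin, forcing the model to attend to bilingual alignment
    # rather than target-side fluency alone.
    loss_ctr = sum(
        F.relu(margin - (s_pos - score_head(encode(src, noisy)).squeeze()))
        for noisy in noisy_tgts
    ) / max(len(noisy_tgts), 1)
    return loss_reg + loss_ctr
```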
Iterative Nearest Neighbour Machine Translation for Unsupervised Domain Adaptation
Hui Huang | Shuangzhi Wu | Xinnian Liang | Zefan Zhou | Muyun Yang | Tiejun Zhao
Findings of the Association for Computational Linguistics: ACL 2023
Unsupervised domain adaptation of machine translation, which adapts a pre-trained translation model to a specific domain without in-domain parallel data, has drawn extensive attention in recent years. However, most existing methods focus on fine-tuning-based techniques, which are not extensible. In this paper, we propose a new method to perform unsupervised domain adaptation in a non-parametric manner. Our method resorts only to in-domain monolingual data, and we jointly perform nearest neighbour inference in both the forward and backward translation directions. The forward translation model creates a nearest neighbour datastore for the backward direction, and vice versa, so the two strengthen each other in an iterative style. Experiments on multi-domain datasets demonstrate that our method significantly improves in-domain translation performance and achieves state-of-the-art results among non-parametric methods.
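For context, a minimal sketch of nearest-neighbour interpolation at decoding time in the style of kNN-MT; datastore construction from in-domain monolingual data via the reverse-direction model is assumed, and the details differ from the paper:

```python
import numpy as np

def knn_interpolate(hidden, model_probs, keys, values, k=8, temp=10.0, lam=0.5):
    """hidden: (d,) decoder state; keys: (N, d) datastore states;
    values: (N,) target token ids; model_probs: (V,) NMT distribution."""
    dists = np.sum((keys - hidden) ** 2, axis=1)   # L2 to all entries
    nn = np.argsort(dists)[:k]                     # k nearest neighbours
    weights = np.exp(-dists[nn] / temp)
    weights /= weights.sum()
    knn_probs = np.zeros_like(model_probs)
    np.add.at(knn_probs, values[nn], weights)      # scatter onto vocabulary
    # Interpolate the parametric model with the non-parametric retrieval.
    return lam * knn_probs + (1 - lam) * model_probs
```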
HIT-MI&T Lab’s Submission to Eval4NLP 2023 Shared Task
Rui Zhang | Fuhai Song | Hui Huang | Jinghao Yuan | Muyun Yang | Tiejun Zhao
Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems
Recently, Large Language Models (LLMs) have boosted research in natural language processing and shown impressive capabilities across numerous domains, including machine translation evaluation. This paper presents the methods we developed for the machine translation evaluation sub-task of the Eval4NLP 2023 Shared Task. Based on the provided LLMs, we propose a generation-based method as well as a probability-based method to perform evaluation, explore different strategies for selecting demonstrations for in-context learning, and try different ensemble methods to further improve evaluation accuracy. Experimental results on the development and test sets demonstrate the effectiveness of our proposed methods.
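A minimal sketch of one probability-based variant: scoring a translation by the probability mass the LLM places on a positive judgment token; the prompt wording and Yes/No labels are assumptions for illustration, not the shared-task submission itself:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def prob_score(model_name, src, hyp):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    prompt = (f"Source: {src}\nTranslation: {hyp}\n"
              "Is this translation accurate? Answer Yes or No: ")
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]      # next-token logits
    probs = torch.softmax(logits, dim=-1)
    yes = tok(" Yes", add_special_tokens=False).input_ids[0]
    no = tok(" No", add_special_tokens=False).input_ids[0]
    # Normalised probability of the positive label as the quality score.
    return (probs[yes] / (probs[yes] + probs[no])).item()
```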
2022
中文专利关键信息语料库的构建研究(Research on the construction of Chinese patent key information corpus)
Wenting Zhang (张文婷) | Meihan Zhao (赵美含) | Yixuan Ma (马翊轩) | Wenrui Wang (王文瑞) | Yuzhe Liu (刘宇哲) | Muyun Yang (杨沐昀)
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Patent documents are an important type of technical literature and a key focus for any nation seeking strength in intellectual property. Existing patent corpora mostly target information retrieval, machine translation, and text classification, and still lack finer-grained annotation, which is insufficient to support emerging AI applications such as question answering and reading comprehension. To meet the needs of intelligent patent analysis, this paper proposes annotating invention patents from three perspectives: the problem to be solved, the technical means, and the effect. We ultimately construct a Chinese patent key information corpus of 313 documents. Applying named entity recognition techniques to identify and verify the key information in the corpus shows that recognizing patent key information is a coarser-grained information extraction problem distinct from domain named entity recognition.
2021
Grammar-Based Patches Generation for Automated Program Repair
Yu Tang | Long Zhou | Ambrosio Blanco | Shujie Liu | Furu Wei | Ming Zhou | Muyun Yang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
2020
Robust Machine Reading Comprehension by Learning Soft Labels
Zhenyu Zhao | Shuangzhi Wu | Muyun Yang | Kehai Chen | Tiejun Zhao
Proceedings of the 28th International Conference on Computational Linguistics
Neural models have achieved great success on machine reading comprehension (MRC), and they are typically trained on hard labels. We argue that hard labels limit the model’s generalization ability due to the label sparseness problem. In this paper, we propose a robust training method for MRC models to address this problem. Our method consists of three strategies: 1) label smoothing, 2) word overlapping, and 3) distribution prediction. All of them help to train models on soft labels. We validate our approach on a representative architecture, ALBERT. Experimental results show that our method boosts the baseline by 1% on average and achieves state-of-the-art performance on NewsQA and QUOREF.
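A minimal sketch of soft-label training for span start prediction, combining label smoothing with a word-overlap-style neighbourhood; the mixing weights are illustrative, not the paper's tuned values:

```python
import torch
import torch.nn.functional as F

def soft_span_loss(start_logits, gold_start, eps=0.1, window=2):
    """start_logits: (seq_len,) span-start scores; gold_start: gold index."""
    seq_len = start_logits.shape[0]
    target = torch.full((seq_len,), eps / seq_len)   # label smoothing floor
    lo = max(0, gold_start - window)
    hi = min(seq_len, gold_start + window + 1)
    target[lo:hi] += (1 - eps) * 0.5 / (hi - lo)     # overlap neighbourhood
    target[gold_start] += (1 - eps) * 0.5            # gold keeps most mass
    target /= target.sum()                           # renormalise
    # KL divergence between the predicted distribution and the soft target.
    return F.kl_div(F.log_softmax(start_logits, dim=-1), target,
                    reduction="sum")
```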
End-to-End Speech Translation with Adversarial Training
Xuancai Li | Kehai Chen | Tiejun Zhao | Muyun Yang
Proceedings of the First Workshop on Automatic Simultaneous Translation
End-to-end speech translation usually leverages audio-to-text parallel data to train a speech translation model, and such models have shown impressive results on various speech translation tasks. Because collecting audio-to-text parallel data is costly, speech translation is naturally a low-resource translation scenario, which greatly hinders its improvement. In this paper, we propose a new adversarial training method that leverages target monolingual data to alleviate the low-resource shortcoming of speech translation. In our method, the existing speech translation model is treated as a Generator producing target-language output, and a neural Discriminator is used to distinguish the outputs of the speech translation model from true target monolingual sentences. Experimental results on the CCMT 2019-BSTC speech translation task demonstrate that the proposed method significantly improves the performance of the end-to-end speech translation system.
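A minimal sketch of the generator/discriminator step, with hypothetical st_model and disc interfaces (disc outputs a probability); this illustrates the adversarial objective rather than reproducing the paper's implementation:

```python
import torch
import torch.nn.functional as F

def adversarial_step(st_model, disc, audio, mono_sents, g_opt, d_opt):
    fake = st_model.generate_outputs(audio)   # generator: ST model outputs
    real = disc.embed(mono_sents)             # true target monolingual text

    # 1) Train the discriminator: real sentences -> 1, ST outputs -> 0.
    d_real, d_fake = disc(real), disc(fake.detach())
    d_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Train the generator (the ST model) to fool the discriminator.
    d_gen = disc(fake)
    g_loss = F.binary_cross_entropy(d_gen, torch.ones_like(d_gen))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```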
2017
Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus
Kees van Deemter | Le Sun | Rint Sybesma | Xiao Li | Bo Chen | Muyun Yang
Proceedings of the 10th International Conference on Natural Language Generation
East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number. We present the first Data-Text corpus for Referring Expressions in Mandarin, and we use this corpus to test some initial hypotheses inspired by the theoretical linguistics literature. Our findings suggest that function words deserve more attention in Referring Expressions Generation than they have so far received, and they have a bearing on the debate about whether different languages make different trade-offs between clarity and brevity.
2015
Hierarchical Recurrent Neural Network for Document Modeling
Rui Lin | Shujie Liu | Muyun Yang | Mu Li | Ming Zhou | Sheng Li
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2014
Learning Topic Representation for SMT with Neural Networks
Lei Cui | Dongdong Zhang | Shujie Liu | Qiming Chen | Mu Li | Ming Zhou | Muyun Yang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2013
A Hierarchical Semantics-Aware Distributional Similarity Scheme
Shuqi Sun | Ke Sun | Shiqi Zhao | Haifeng Wang | Muyun Yang | Sheng Li
Proceedings of the Sixth International Joint Conference on Natural Language Processing
Repairing Incorrect Translation with Examples
Junguo Zhu | Muyun Yang | Sheng Li | Tiejun Zhao
Proceedings of the Sixth International Joint Conference on Natural Language Processing
2011
Harvesting Related Entities with a Search Engine
Shuqi Sun | Shiqi Zhao | Muyun Yang | Haifeng Wang | Sheng Li
Proceedings of 5th International Joint Conference on Natural Language Processing
2010
Reexamination on Potential for Personalization in Web Search
Daren Li | Muyun Yang | Haoliang Qi | Sheng Li | Tiejun Zhao
Coling 2010: Posters
Utilizing Variability of Time and Term Content, within and across Users in Session Detection
Shuqi Sun | Sheng Li | Muyun Yang | Haoliang Qi | Tiejun Zhao
Coling 2010: Posters
All in Strings: a Powerful String-based Automatic MT Evaluation Metric with Multiple Granularities
Junguo Zhu | Muyun Yang | Bo Wang | Sheng Li | Tiejun Zhao
Coling 2010: Posters
2009
References Extension for the Automatic Evaluation of MT by Syntactic Hybridization
Bo Wang | Tiejun Zhao | Muyun Yang | Sheng Li
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009
A Study of Translation Rule Classification for Syntax-based Statistical Machine Translation
Hongfei Jiang | Sheng Li | Muyun Yang | Tiejun Zhao
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009
A Statistical Machine Translation Model Based on a Synthetic Synchronous Grammar
Hongfei Jiang | Muyun Yang | Tiejun Zhao | Sheng Li | Bo Wang
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
2007
HIT-WSD: Using Search Engine for Multilingual Chinese-English Lexical Sample Task
Pengyuan Liu | Tiejun Zhao | Muyun Yang
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)
2002
Learning Chinese Bracketing Knowledge Based on a Bilingual Language Model
Yajuan Lü | Sheng Li | Tiejun Zhao | Muyun Yang
COLING 2002: The 19th International Conference on Computational Linguistics
2001
Automatic Detection of Prosody Phrase Boundaries for Text-to-Speech System
Xin Lv | Tie-jun Zhao | Zhan-yi Liu | Mu-yun Yang
Proceedings of the Seventh International Workshop on Parsing Technologies
2000
Statistics Based Hybrid Approach to Chinese Base Phrase Identification
Tie-jun Zhao | Mu-yun Yang | Fang Liu | Jian-min Yao | Hao Yu
Second Chinese Language Processing Workshop