Yulin Yuan


2024

Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation
Longyue Wang | Siyou Liu | Chenyang Lyu | Wenxiang Jiao | Xing Wang | Jiahao Xu | Zhaopeng Tu | Yan Gu | Weiyu Chen | Minghao Wu | Liting Zhou | Philipp Koehn | Andy Way | Yulin Yuan
Proceedings of the Ninth Conference on Machine Translation

Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a shared task at WMT 2024, the second edition of the Discourse-Level Literary Translation task. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted, document-level Chinese-English web novel corpus. Furthermore, we put forth industry-endorsed criteria to guide the human evaluation process. This year, we received a total of 10 submissions from 5 academic and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and leaderboard at https://www2.statmt.org/wmt24/literary-translation-task.html.

2023

Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
Longyue Wang | Zhaopeng Tu | Yan Gu | Siyou Liu | Dian Yu | Qingsong Ma | Chenyang Lyu | Liting Zhou | Chao-Hong Liu | Yufeng Ma | Weiyu Chen | Yvette Graham | Bonnie Webber | Philipp Koehn | Andy Way | Yulin Yuan | Shuming Shi
Proceedings of the Eighth Conference on Machine Translation

Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation task. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted, document-level Chinese-English web novel corpus. Furthermore, we put forth industry-endorsed criteria to guide the human evaluation process. This year, we received a total of 14 submissions from 7 academic and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and leaderboard at http://www2.statmt.org/wmt23/literary-translation-task.html.

2022

双重否定结构自动识别研究(The Research on Automatic Recognition of the Double Negation Structure)
Yu Wang (王昱) | Yulin Yuan (袁毓林)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

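The double negation structure is a special structure that "expresses an affirmative meaning through two negations", and its presence has an important effect on semantic judgment and sentiment classification in natural language processing. Taking "¬¬P ⇒ P" as the criterion, this paper exhaustively examines all "negative word + negative word" structures in Modern Chinese and divides double negation structures by form into 3 major classes and 25 subclasses, covering 132 commonly used double negation structures or constructions. Drawing on theories of verb factivity, the focus of negation, and semantic versus pragmatic negation, the paper summarizes three conditions under which a double negation structure holds, and on that basis designs and implements a rule-based program for automatically recognizing double negation structures. In experiments, the program achieves a precision of 98.85%, a recall of 98.90%, and an F1 score of 98.85%. The program also extracted 8,640 sentences containing double negation structures, at a precision of about 99%, from a corpus of 96,281 sentences, which may provide corpus support for subsequent statistics-based deep learning models.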

2020

《动词句法语义信息词典》知识内容说明书(An Introduction to the Syntactic-Semantic Knowledge-Base of Chinese Verbs)
Yulin Yuan (袁毓林) | Hong Cao (曹宏)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

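This paper first introduces the development goals and the structure of the Content Word Information Dictionary (《实词信息词典》), focusing on the architecture and theoretical background of its Verb Information Dictionary (《动词信息词典》). It then describes the 8 verb subclasses that the Verb Information Dictionary distinguishes and their definitions, the 22 semantic roles it sets up for verbs and their definitions, the 20 or so syntactic patterns, with example sentences, that arise from different configurations of these semantic roles, and the 9 main grammatical functions of verbs that it examines, together with their degree of membership in the verb class. Finally, it presents screenshots of the retrieval system interface of the Verb Information Dictionary and describes the corresponding print edition.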

2015

Linguistic Knowledge-driven Approach to Chinese Comparative Elements Extraction
MinJun Park | Yulin Yuan
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

2012

To Construct the Interpretation Templates for the Chinese Noun Compounds Based on Semantic Classes and Qualia Structures
Xue Wei | Yulin Yuan
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation