Xinyu Chen


2023

pdf bib
Cross-Document Event Coreference Resolution on Discourse Structure
Xinyu Chen | Sheng Xu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Cross-document event coreference resolution (CD-ECR) is a task of clustering event mentions across multiple documents that refer to the same real-world events. Previous studies usually model the CD-ECR task as a pairwise similarity comparison problem by using different event mention features, and consider the highly similar event mention pairs in the same cluster as coreferent. In general, most of them only consider the local context of event mentions and ignore their implicit global information, thus failing to capture the interactions of long-distance event mentions. To address the above issue, we regard discourse structure as global information to further improve CD-ECR. First, we use a discourse rhetorical structure constructor to construct tree structures to represent documents. Then, we obtain shortest dependency paths from the tree structures to represent interactions between event mention pairs. Finally, we feed the above information to a multi-layer perceptron to capture the similarities of event mention pairs for resolving coreferent events. Experimental results on the ECB+ dataset show that our proposed model outperforms several baselines and achieves the competitive performance with the start-of-the-art baselines.

pdf bib
基于深加工语料库的《唐诗三百首》难度分级(The difficulty classification of ‘ Three Hundred Tang Poems ’ based on the deep processing corpus)
Yuyu Huang (黄宇宇) | Xinyu Chen (陈欣雨) | Minxuan Feng (冯敏萱) | Yunuo Wang (王禹诺) | Beiyuan Wang (蓓原王,) | Bin Li (李斌)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“为辅助中小学教材及读本中唐诗的选取,本文基于对《唐诗三百首》分词、词性、典故标记的深加工语料库,据诗句可读性创新性地构建了分级标准,共分4层,共计8项可量化指标:字层(通假字)、词层(双字词)、句层(特殊句式、标题长度、诗句长度)、艺术层(典故、其他修辞、描写手法)。据以上8项指标对语料库中313首诗评分,建立基于量化特征的向量空间模型,以K-means聚类算法将诗歌聚类以对应小学、初中和高中3个学段的唐诗学习。”