Chao Xu


2022

pdf bib
The First International Ancient Chinese Word Segmentation and POS Tagging Bakeoff: Overview of the EvaHan 2022 Evaluation Campaign
Bin Li | Yiguo Yuan | Jingya Lu | Minxuan Feng | Chao Xu | Weiguang Qu | Dongbo Wang
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

This paper presents the results of the First Ancient Chinese Word Segmentation and POS Tagging Bakeoff (EvaHan), which was held at the Second Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) 2022, in the context of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022). We give the motivation for having an international shared contest, as well as the data and tracks. The contest is consisted of two modalities, closed and open. In the closed modality, the participants are only allowed to use the training data, obtained the highest F1 score of 96.03% and 92.05% in word segmentation and POS tagging. In the open modality, the participants can use whatever resource they have, with the highest F1 score of 96.34% and 92.56% in word segmentation and POS tagging. The scores on the blind test dataset decrease around 3 points, which shows that the out-of-vocabulary words still are the bottleneck for lexical analyzers.

pdf bib
Drum Up SUPPORT: Systematic Analysis of Image-Schematic Conceptual Metaphors
Lennart Wachowiak | Dagmar Gromann | Chao Xu
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)

Conceptual metaphors represent a cognitive mechanism to transfer knowledge structures from one onto another domain. Image-schematic conceptual metaphors (ISCMs) specialize on transferring sensorimotor experiences to abstract domains. Natural language is believed to provide evidence of such metaphors. However, approaches to verify this hypothesis largely rely on top-down methods, gathering examples by way of introspection, or on manual corpus analyses. In order to contribute towards a method that is systematic and can be replicated, we propose to bring together existing processing steps in a pipeline to detect ISCMs, exemplified for the image schema SUPPORT in the COVID-19 domain. This pipeline consist of neural metaphor detection, dependency parsing to uncover construction patterns, clustering, and BERT-based frame annotation of dependent constructions to analyse ISCMs.

2020

pdf bib
A Cognitively Motivated Approach to Spatial Information Extraction
Chao Xu | Emmanuelle-Anna Dietz Saldanha | Dagmar Gromann | Beihai Zhou
Proceedings of the Third International Workshop on Spatial Language Understanding

Automatic extraction of spatial information from natural language can boost human-centered applications that rely on spatial dynamics. The field of cognitive linguistics has provided theories and cognitive models to address this task. Yet, existing solutions tend to focus on specific word classes, subject areas, or machine learning techniques that cannot provide cognitively plausible explanations for their decisions. We propose an automated spatial semantic analysis (ASSA) framework building on grammar and cognitive linguistic theories to identify spatial entities and relations, bringing together methods of spatial information extraction and cognitive frameworks on spatial language. The proposed rule-based and explainable approach contributes constructions and preposition schemas and outperforms previous solutions on the CLEF-2017 standard dataset.