Guandong Xu

2025

Large language models (LLMs) have made significant strides in natural language processing by leveraging their ability to comprehend and reason with factual knowledge. However, a significant amount of factual knowledge is stored in structured data, which has unique characteristics not typically encountered in the unstructured texts used for pretraining LLMs. To evaluate the capability of LLMs in handling structurally stored facts, we introduce a benchmark called StructFact, which includes meticulously annotated factual questions spanning five tasks that reflect the intrinsic properties of structured data. This benchmark aims to delineate the strengths and limitations of LLMs in reasoning with structured data for knowledge-intensive tasks in practical applications. Extensive experiments conducted on 10 common LLMs have yielded several insights; one notable finding is that these models struggle significantly with the heterogeneity of structured data during reasoning.

E-commerce authoring entails creating engaging, diverse, and targeted content to enhance preference elicitation and retrieval experience. While Large Language Models (LLMs) have revolutionized content generation, they often fall short in e-commerce applications due to their limited memorization of domain-specific features. This paper proposes LLaMA-E, unified e-commerce authoring models that address the contextual preferences of customers, sellers, and platforms, the essential objects in e-commerce operation. We design the instruction set derived from the tasks of ads generation, query-enhanced product title rewriting, product classification, purchase intent speculation, and general e-commerce Q&A. The instruction formulation ensures interleaved coverage of the presented and required object features, allowing the alignment of base models to parameterize e-commerce knowledge comprehensively. The proposed LLaMA-E models achieve state-of-the-art evaluation performance and exhibit advantages in zero-shot practical applications. To our knowledge, this is the first LLM tailored to empower authoring applications with comprehensive scenario understanding by integrating features focused on the participating objects.

2023

Abstract Meaning Representation (AMR) is a semantic representation that can enhance natural language generation (NLG) by providing a logical semantic input. In this paper, we propose AMR-TST, an AMR-based text style transfer (TST) technique. AMR-TST converts the source text to an AMR graph and generates the transferred text based on the AMR graph modified by a TST policy named style rewriting. Our method combines both the explainability and diversity of explicit and implicit TST methods. The experiments show that the proposed method achieves state-of-the-art results compared with other baseline models in automatic and human evaluations. The generated transferred text in qualitative evaluation shows that AMR-TST has significant advantages in keeping semantic features and reducing hallucinations. To the best of our knowledge, this work is the first to apply the AMR method focusing on node-level features to the TST task.

2022

Can Language Models Serve as Temporal Knowledge Bases?
Ruilin Zhao | Feng Zhao | Guandong Xu | Sixiao Zhang | Hai Jin
Findings of the Association for Computational Linguistics: EMNLP 2022

Recent progress regarding the use of language models (LMs) as knowledge bases (KBs) has shown that language models can act as structured knowledge bases for storing relational facts. However, most existing works only considered the LM-as-KB paradigm in a static setting, which ignores the analysis of temporal dynamics of world knowledge. Furthermore, a basic function of KBs, namely the ability to store conflicting information (i.e., 1-N, N-1, and N-M relations), is underexplored. In this paper, we formulate two practical requirements for treating LMs as temporal KBs: (i) the capacity to store temporally-scoped knowledge that contains conflicting information and (ii) the ability to use stored knowledge for temporally-scoped knowledge queries. We introduce a new dataset called LAMA-TK, which is aimed at probing temporally-scoped knowledge, and investigate the two requirements above to explore the LM-as-KB paradigm in the temporal domain. On the one hand, experiments show that LMs can memorize millions of temporally-scoped facts with relatively high accuracy and transfer stored knowledge to temporal knowledge queries, thereby expanding the LM-as-KB paradigm to the temporal domain. On the other hand, we show that memorizing conflicting information, which has been neglected by previous works, is still challenging for LMs and hinders the memorization of other unrelated one-to-one relationships.

Medication recommendation is a crucial task for intelligent healthcare systems. Previous studies mainly recommend medications with electronic health records (EHRs). However, some details of interactions between doctors and patients may be ignored or omitted in EHRs, which are essential for automatic medication recommendation. Therefore, we make the first attempt to recommend medications with the conversations between doctors and patients. In this work, we construct DIALMED, the first high-quality dataset for the medical dialogue-based medication recommendation task. It contains 11,996 medical dialogues related to 16 common diseases from 3 departments and 70 corresponding common medications. Furthermore, we propose a Dialogue structure and Disease knowledge aware Network (DDN), where a QA Dialogue Graph mechanism is designed to model the dialogue structure and the knowledge graph is used to introduce external disease knowledge. The extensive experimental results demonstrate that the proposed method is a promising solution to recommend medications with medical dialogues. The dataset and code are available at https://github.com/f-window/DialMed.

2021

Harnessing Privileged Information for Hyperbole Detection
Rhys Biddle | Maciek Rybinski | Qian Li | Cecile Paris | Guandong Xu
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association

The detection of hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance. We propose a model that combines pre-trained language models with privileged information for the task of hyperbole detection. We also introduce a suite of behavioural tests to probe the capabilities of hyperbole detection models across a range of hyperbole types. Our experiments show that our model improves upon baseline models on an existing hyperbole detection dataset. Probing experiments combined with analysis using local linear approximations (LIME) show that our model excels at detecting one particular type of hyperbole. Further, our experiments highlight annotation artifacts introduced through the process of literal paraphrasing of hyperbole. These annotation artifacts are likely to be a roadblock to further improvements in hyperbole detection.

2019

A Boundary-aware Neural Model for Nested Named Entity Recognition
Changmeng Zheng | Yi Cai | Jingyun Xu | Ho-fung Leung | Guandong Xu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In natural language processing, it is common for entities to contain other entities inside them. Most existing works on named entity recognition (NER) only deal with flat entities but ignore nested ones. We propose a boundary-aware neural model for nested NER which leverages entity boundaries to predict entity categorical labels. Our model can locate entities precisely by detecting boundaries using sequence labeling models. Based on the detected boundaries, our model utilizes the boundary-relevant regions to predict entity categorical labels, which can decrease computation cost and relieve the error propagation problem in layered sequence labeling models. We introduce multitask learning to capture the dependencies of entity boundaries and their categorical labels, which helps to improve the performance of identifying entities. We conduct our experiments on the GENIA dataset, and the experimental results demonstrate that our model outperforms other state-of-the-art methods.