Proceedings of Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning @ COLING 2025

Kang Liu, Yangqiu Song, Zhen Han, Rafet Sifa, Shizhu He, Yunfei Long (Editors)


Anthology ID: 2025.neusymbridge-1
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Venues: NeusymBridge | WS
Publisher: ELRA and ICCL
URL: https://aclanthology.org/2025.neusymbridge-1/
PDF: https://aclanthology.org/2025.neusymbridge-1.pdf

Proceedings of Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning @ COLING 2025
Kang Liu | Yangqiu Song | Zhen Han | Rafet Sifa | Shizhu He | Yunfei Long

Chain of Knowledge Graph: Information-Preserving Multi-Document Summarization for Noisy Documents
Kangil Lee | Jinwoo Jang | Youngjin Lim | Minsu Shin

With the advent of large language models, the complexity of the multi-document summarization task has been substantially reduced. The summarization process must effectively handle noisy documents that are irrelevant to the main topic while preserving essential information. Recently, Chain-of-Density (CoD) and Chain-of-Event (CoE) have introduced prompts that handle noisy documents through entity-centric summarization. However, CoD and CoE are prone to information loss during entity extraction because they tend to over-filter entities perceived as less critical but still potentially important. In this paper, we propose a novel instruction prompt, termed Chain of Knowledge Graph (CoKG), for multi-document summarization. Our prompt extracts entities and constructs relationships between them to form a Knowledge Graph (KG). Next, the prompt enriches these relationships to recognize potentially important entities and assess the strength of each relation. If the acquired KG meets a predefined quality level, it is used to summarize the given documents. This process helps alleviate information loss in multi-document summarization. Experimental results demonstrate that our prompt effectively preserves key entities and is robust to noisy documents.
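
As a rough illustration of the pipeline described above, the sketch below runs the three CoKG stages (triple extraction, enrichment with relation-strength scoring, KG-conditioned summarization) against any text-completion function. The prompt wording, line formats, and quality threshold are assumptions for illustration, not the authors' actual prompt.

```python
from typing import Callable

def cokg_summarize(documents: list[str],
                   llm: Callable[[str], str],
                   quality_threshold: float = 0.7) -> str:
    """Sketch of the CoKG flow: extract a KG, enrich and score it, then
    summarize from the KG once it meets the quality bar."""
    joined = "\n\n".join(documents)

    # Step 1: extract (subject | relation | object) triples from all documents.
    triples = [
        tuple(part.strip() for part in line.split("|"))
        for line in llm("Extract (subject | relation | object) triples, "
                        "one per line:\n" + joined).splitlines()
        if line.count("|") == 2
    ]

    # Step 2: enrich the relations and request a 0-1 strength per relation so
    # that less salient but still important entities are kept, not dropped.
    scored_lines = llm(
        "Add any missing but important entities and rate each relation's "
        "strength from 0 to 1 as 'subject | relation | object :: score':\n"
        + "\n".join(" | ".join(t) for t in triples)
    ).splitlines()
    scores = [float(line.rsplit("::", 1)[1]) for line in scored_lines if "::" in line]

    # Step 3: summarize from the KG only if it clears the quality threshold;
    # otherwise fall back to the raw documents.
    kg_ok = scores and sum(scores) / len(scores) >= quality_threshold
    facts = "\n".join(scored_lines) if kg_ok else joined
    return llm("Write a faithful summary using only these facts:\n" + facts)
```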

CEGRL-TKGR: A Causal Enhanced Graph Representation Learning Framework for Temporal Knowledge Graph Reasoning
Jinze Sun | Yongpan Sheng | Lirong He | Yongbin Qin | Ming Liu | Tao Jia

Temporal knowledge graph reasoning (TKGR) is increasingly gaining attention for its ability to extrapolate new events from historical data, thereby enriching the inherently incomplete temporal knowledge graphs. Existing graph-based representation learning frameworks have made significant strides in developing evolving representations for both entity and relation embeddings. Despite these achievements, these models tend to inadvertently learn biased data representations and mine spurious correlations, and consequently fail to discern the causal relationships between events. This often leads to incorrect predictions based on these false correlations. To address this, we propose an innovative Causal Enhanced Graph Representation Learning framework for TKGR (named CEGRL-TKGR). This framework introduces causal structures into graph-based representation learning to unveil the essential causal relationships between events, ultimately enhancing the performance of the TKGR task. Specifically, we first disentangle the evolutionary representations of entities and relations in a temporal knowledge graph sequence into two distinct components, namely causal representations and confounding representations. Then, drawing on causal intervention theory, we advocate using the causal representations for predictions, aiming to mitigate the effects of spurious correlations caused by confounding features and thus achieve more robust and accurate predictions. Finally, extensive experimental results on six benchmark datasets demonstrate the superior performance of our model on the link prediction task.
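
A minimal PyTorch sketch of the disentangling idea follows, assuming a simple linear split into causal and confounding components and a translational scoring function; the authors' actual architecture and training objective are not reproduced here.

```python
import torch
import torch.nn as nn

class CausalDisentangler(nn.Module):
    """Split an evolutionary embedding into a causal component (used for
    prediction) and a confounding component (discarded at inference)."""
    def __init__(self, dim: int):
        super().__init__()
        self.to_causal = nn.Linear(dim, dim)
        self.to_confound = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor):
        return self.to_causal(h), self.to_confound(h)

def score_link(causal_ent, causal_rel, candidate_ents):
    # Predict from causal representations only, mimicking an intervention
    # that removes the influence of confounding features.
    query = causal_ent + causal_rel          # simple translational scoring
    return candidate_ents @ query            # higher score = more plausible

dim, num_entities = 64, 1000
disentangle = CausalDisentangler(dim)
c_ent, _ = disentangle(torch.randn(dim))     # entity's evolutionary embedding
c_rel, _ = disentangle(torch.randn(dim))     # relation's evolutionary embedding
scores = score_link(c_ent, c_rel, torch.randn(num_entities, dim))
```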

Reasoning Knowledge Filter for Logical Table-to-Text Generation
Yu Bai | Baoqiang Liu | Shuang Xue | Fang Cai | Na Ye | Guiping Zhang

Logical table-to-text generation (LT2T) seeks to produce logically faithful textual descriptions based on tables. Current end-to-end LT2T models, which use descriptions directly as learning objectives, frequently struggle to maintain logical faithfulness due to the lack of reasoning knowledge. Recent research has introduced model-generated reasoning knowledge for the LT2T task, but the accompanying noise limits its benefit. We therefore propose a Reasoning Knowledge Filter, a framework that leverages collaboration between large language models and smaller models to filter data points with high-quality reasoning knowledge. The framework aims to provide well-matched table, description, and reasoning-knowledge triplets for LT2T. Results on the LogicNLG dataset demonstrate that our method achieves the best performance with a reduced amount of data. Specifically, it improves SP-Acc by 1.4 points and NLI-Acc by 0.7 points over the current state-of-the-art model.
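
One way the large-model/small-model collaboration could look in code, as a hedged sketch: a generator proposes reasoning knowledge and a scorer decides whether the (table, description, reasoning) triplet is kept. Both callables and the threshold are placeholders, not the paper's implementation.

```python
from typing import Callable

def filter_reasoning_knowledge(
    pairs: list[tuple[str, str]],                # (linearized table, description)
    generate: Callable[[str, str], str],         # large LLM: propose reasoning knowledge
    score: Callable[[str, str, str], float],     # smaller model: match quality in [0, 1]
    threshold: float = 0.8,
) -> list[tuple[str, str, str]]:
    """Keep only (table, description, reasoning) triplets whose reasoning is
    judged to match both the table and its description."""
    kept = []
    for table, description in pairs:
        reasoning = generate(table, description)
        if score(table, description, reasoning) >= threshold:
            kept.append((table, description, reasoning))
    return kept
```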

From Chain to Tree: Refining Chain-like Rules into Tree-like Rules on Knowledge Graphs
Wangtao Sun | Shizhu He | Jun Zhao | Kang Liu

With good explainability and controllability, rule-based methods play an important role in Knowledge Graph Completion (KGC). However, existing studies have primarily focused on learning chain-like rules, whose chain-like structure limits their expressive power. Consequently, chain-like rules often exhibit lower Standard Confidence and are prone to incorrect grounding values during reasoning, producing erroneous reasoning results. In this paper, we propose the concept of tree-like rules on knowledge graphs to expand the scope of application and improve the reasoning ability of rule-based methods. To achieve this, we formalize the problem of tree-like rule refinement and propose an effective framework for refining chain-like rules into tree-like rules. Experimental evaluations on four public datasets demonstrate that the proposed framework can seamlessly adapt to various chain-like rule induction methods, and the refined tree-like rules consistently exhibit higher Standard Confidence and achieve better performance than the original chain-like rules on link prediction tasks. Furthermore, we show that the improvements brought by tree-like rules are positively correlated with the density of the knowledge graphs. The data and code of this paper are available at https://github.com/forangel2014/tree-rule.
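
The toy example below illustrates the underlying definitions: the Standard Confidence of a rule is the fraction of body groundings whose head also holds in the KG, and attaching a branch atom to an intermediate variable (turning a chain into a tree) can raise it. The miniature KG and relation names are invented for illustration only.

```python
from itertools import product

# Toy KG as a set of (head_entity, relation, tail_entity) facts.
KG = {
    ("alice", "born_in", "paris"), ("paris", "located_in", "france"),
    ("paris", "capital_of", "france"), ("alice", "citizen_of", "france"),
    ("bob", "born_in", "nice"), ("nice", "located_in", "france"),
}
ENTITIES = {e for h, _, t in KG for e in (h, t)}

def groundings(body, variables):
    """Brute-force variable assignments satisfying every body atom
    (fine for a toy KG; only meant to illustrate the definition)."""
    for values in product(ENTITIES, repeat=len(variables)):
        binding = dict(zip(variables, values))
        if all((binding[s], r, binding[o]) in KG for s, r, o in body):
            yield binding

def standard_confidence(head, body, variables):
    sat = list(groundings(body, variables))
    if not sat:
        return 0.0
    hits = sum((b[head[0]], head[1], b[head[2]]) in KG for b in sat)
    return hits / len(sat)

head = ("X", "citizen_of", "Y")
# Chain-like rule body: born_in(X, Z) AND located_in(Z, Y)
chain = [("X", "born_in", "Z"), ("Z", "located_in", "Y")]
# Tree-like refinement: attach a branch atom capital_of(Z, Y) to variable Z.
tree = chain + [("Z", "capital_of", "Y")]
print(standard_confidence(head, chain, ["X", "Z", "Y"]))  # 0.5
print(standard_confidence(head, tree, ["X", "Z", "Y"]))   # 1.0 -- higher SC
```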

LAB-KG: A Retrieval-Augmented Generation Method with Knowledge Graphs for Medical Lab Test Interpretation
Rui Guo | Barry Devereux | Greg Farnan | Niall McLaughlin

Laboratory tests generate structured numerical data, which a clinician must interpret to justify diagnoses and help patients understand the outcomes of the tests. LLMs have the potential to assist with the generation of interpretive comments, but legitimate concerns remain about the accuracy and reliability of the generation process. This work introduces LAB-KG, which conditions the generation process of an LLM on information retrieved from a knowledge graph of relevant patient conditions and lab test results. This helps to ground the text-generation process in accurate medical knowledge and enables generated text to be traced back to the knowledge graph. Given a dataset of laboratory test results and associated interpretive comments, we show how an LLM can build a KG of the relationships between laboratory test results, reference ranges, patient conditions and demographic information. We further show that the interpretive comments produced by an LLM conditioned on information retrieved from the KG are of higher quality than those from a standard RAG method. Finally, we show how our KG approach can improve the interpretability of the LLM-generated text.
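
A hedged sketch of the retrieve-then-generate flow follows: facts about a test are pulled from a KG, numbered, and passed to the LLM so the generated comment can be traced back to graph edges. The KG fragment, relation names, and prompt are illustrative assumptions, not the paper's schema.

```python
from typing import Callable

# Illustrative KG fragment; the relation names and values are invented.
KG = [
    ("haemoglobin", "reference_range", "130-170 g/L (adult male)"),
    ("haemoglobin", "low_value_suggests", "anaemia"),
    ("anaemia", "common_cause", "iron deficiency"),
]

def retrieve(test_name: str) -> list[tuple[str, str, str]]:
    # One-hop expansion from the test node; a real system would walk further.
    seeds = {test_name}
    seeds |= {o for s, _, o in KG if s in seeds}
    return [t for t in KG if t[0] in seeds]

def interpret(test_name: str, value: str, llm: Callable[[str], str]) -> str:
    facts = retrieve(test_name)
    numbered = "\n".join(f"[{i}] {s} {r} {o}" for i, (s, r, o) in enumerate(facts))
    prompt = (f"Patient result: {test_name} = {value}.\n"
              "Using only the numbered facts below, write an interpretive "
              "comment and cite the fact numbers you rely on, so the text can "
              "be traced back to the knowledge graph.\n" + numbered)
    return llm(prompt)
```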

Bridging Language and Scenes through Explicit 3-D Model Construction
Tiansi Dong | Writwick Das | Rafet Sifa

We introduce the methodology of explicit model construction to bridge linguistic descriptions and scene perception, and we demonstrate it for Visual Question Answering (VQA) with MC4VQA (Model Construction for Visual Question-Answering), a method we developed. Given a question about a scene, MC4VQA first recognizes objects using pre-trained deep learning systems. Then, it constructs an explicit 3-D layout by repeatedly reducing the difference between the input scene image and the image rendered from the current 3-D spatial environment. This novel “iterative rendering” process endows MC4VQA with the capability of acquiring spatial attributes without training data. MC4VQA outperforms NS-VQA (the SOTA system), reaching 99.94% accuracy on the benchmark CLEVR datasets, and is more robust than NS-VQA on new testing datasets. With newly created testing data, NS-VQA’s performance dropped to 97.60%, while MC4VQA maintained 99.0% accuracy. This work sets a new SOTA for VQA on the benchmark CLEVR datasets and establishes a method that may help solve the out-of-distribution problem.
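
The “iterative rendering” process can be pictured as an optimization over layout parameters, as in the sketch below; the finite-difference update and the pixel-wise loss are stand-ins for whatever the authors actually use, and `render` is an injected placeholder rather than a real renderer.

```python
import numpy as np
from typing import Callable

def fit_layout(scene: np.ndarray,
               render: Callable[[np.ndarray], np.ndarray],
               init_params: np.ndarray,
               steps: int = 200, lr: float = 0.05) -> np.ndarray:
    """Nudge a flat vector of 3-D layout parameters (positions, sizes, poses)
    so that the rendered image moves toward the input scene image."""
    params = init_params.astype(float).copy()
    for _ in range(steps):
        base = float(np.mean((render(params) - scene) ** 2))
        grad = np.zeros_like(params)
        for i in range(params.size):             # finite-difference gradient
            bumped = params.copy()
            bumped[i] += 1e-3
            loss = float(np.mean((render(bumped) - scene) ** 2))
            grad[i] = (loss - base) / 1e-3
        params -= lr * grad
    return params
```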

VCRMNER: Visual Cue Refinement in Multimodal NER using CLIP Prompts
Yu Bai | Lianji Wang | Xiang Liu | Haifeng Chi | Guiping Zhang

With the continuous growth of multi-modal data on social media platforms, traditional Named Entity Recognition has become insufficient for handling contemporary data formats. Consequently, researchers proposed Multi-modal Named Entity Recognition (MNER). Existing studies focus on capturing the visual regions corresponding to entities to assist entity recognition. However, these approaches still struggle to mitigate interference from visual regions that are irrelevant to the entities. To address this issue, we propose an innovative framework, Visual Cue Refinement in MNER (VCRMNER) using CLIP prompts, to accurately capture visual cues (object-level visual regions) associated with entities. We leverage prompts to represent the semantic information of entity categories, which helps us assess visual cues and minimize interference from those irrelevant to the entities. Furthermore, we design an interaction transformer that operates in two stages, first within each modality and then between modalities, to refine visual cues by learning from a frozen image encoder, thereby reducing differences between the text and visual modalities. Comprehensive experiments were conducted on two public datasets, Twitter15 and Twitter17. The results and detailed analyses demonstrate that our method exhibits robust and competitive performance.
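
A minimal sketch of the CLIP-prompt step, assuming the openly available HuggingFace CLIP checkpoint: entity categories are verbalized as prompts and object-level region crops are kept only if they match some category strongly enough. The prompt wording and threshold are assumptions, not the paper's settings.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Entity categories verbalized as prompts (wording is an assumption).
category_prompts = ["a photo of a person", "a photo of a location",
                    "a photo of an organization", "a photo of a miscellaneous entity"]

def refine_visual_cues(region_crops: list[Image.Image], threshold: float = 0.25):
    """Keep only object-level region crops that match some entity category."""
    inputs = processor(text=category_prompts, images=region_crops,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = model(**inputs).logits_per_image.softmax(dim=-1)  # regions x categories
    keep = probs.max(dim=-1).values >= threshold
    return [crop for crop, k in zip(region_crops, keep.tolist()) if k]
```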

Neuro-Conceptual Artificial Intelligence: Integrating OPM with Deep Learning to Enhance Question Answering Quality
Xin Kang | Veronika Shteyngardt | Yuhan Wang | Dov Dori

Knowledge representation and reasoning are critical challenges in Artificial Intelligence (AI), particularly in integrating neural and symbolic approaches to achieve explainable and transparent AI systems. Traditional knowledge representation methods often fall short of capturing complex processes and state changes. We introduce Neuro-Conceptual Artificial Intelligence (NCAI), a specialization of the neuro-symbolic AI approach that integrates conceptual modeling using Object-Process Methodology (OPM) ISO 19450:2024 with deep learning to enhance question-answering (QA) quality. By converting natural language text into OPM models using in-context learning, NCAI leverages the expressive power of OPM to represent complex OPM elements—processes, objects, and states—beyond what traditional triplet-based knowledge graphs can easily capture. This rich structured knowledge representation improves reasoning transparency and answer accuracy in an OPM-QA system. We further propose transparency evaluation metrics to quantitatively measure how faithfully the predicted reasoning aligns with OPM-based conceptual logic. Our experiments demonstrate that NCAI outperforms traditional methods, highlighting its potential for advancing neuro-symbolic AI by providing rich knowledge representations, measurable transparency, and improved reasoning.
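
The illustrative data structures below suggest why OPM elements go beyond subject-relation-object triples: a process can consume, yield, or change the state of objects. This is a non-normative sketch, not the OPM ISO 19450:2024 metamodel or the authors' representation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class OPMObject:
    name: str
    states: list[str] = field(default_factory=list)
    current_state: Optional[str] = None

@dataclass
class OPMProcess:
    name: str
    consumes: list[str] = field(default_factory=list)          # object names
    yields: list[str] = field(default_factory=list)
    changes: dict[str, tuple[str, str]] = field(default_factory=dict)  # object -> (from, to)

water = OPMObject("water", states=["liquid", "frozen"], current_state="liquid")
freezing = OPMProcess("freezing", changes={"water": ("liquid", "frozen")})
```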

Emergence of symbolic abstraction heads for in-context learning in large language models
Ali Al-Saeedi | Aki Harma

Large Language Models (LLMs) based on self-attention circuits can perform novel reasoning tasks at inference time, but the mechanisms inside the models are currently not fully understood. We assume that LLMs are able to generalize abstract patterns from the input and form an internal symbolic representation of the content. In this paper, we study this by analyzing the performance of small LLMs trained on sequences of instantiations of abstract sequential symbolic patterns, or templates. We show that even a model with two layers is able to learn an abstract template and use it to generate correct output representing the pattern. This can be seen as a form of symbolic inference taking place inside the network. We call this emergent mechanism an abstraction head. Identifying mechanisms of symbolic reasoning in a neural network can help to find new ways to merge symbolic and neural processing.
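
The kind of training data such a study relies on can be sketched as follows: an abstract template over symbolic slots is instantiated with random concrete tokens, so the model can only predict the continuation by inferring the template. The template shape and vocabulary here are illustrative assumptions.

```python
import random

def instantiate(template: list[str], vocab: list[str], rng: random.Random) -> list[str]:
    """Fill an abstract template such as ['A', 'B', 'A', 'B', 'A'] with random
    concrete tokens; success on the last token requires inferring the pattern."""
    symbols = list(dict.fromkeys(template))            # unique symbols, in order
    mapping = dict(zip(symbols, rng.sample(vocab, len(symbols))))
    return [mapping[s] for s in template]

rng = random.Random(0)
vocab = [f"tok{i}" for i in range(50)]
sequence = instantiate(["A", "B", "A", "B", "A"], vocab, rng)
prompt, target = sequence[:-1], sequence[-1]   # the model must predict `target`
```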

Linking language model predictions to human behaviour on scalar implicatures
Yulia Zinova | David Arps | Katharina Spalek | Jacopo Romoli

We explore the behaviour of language models on adjectival scales in connection with negation when prompted with material used in human experiments. We propose several metrics extracted from the model predictions, analyze these metrics in relation to human data, and use them to propose new items to be tested in human experiments.
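
One plausible metric of the kind mentioned above is the log-probability a causal language model assigns to a scalar adjective continuation in a (negated) context, sketched below with GPT-2; the example sentences are assumptions, not the experimental items used in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def continuation_logprob(context: str, continuation: str) -> float:
    """Log-probability the model assigns to `continuation` given `context`."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = model(full_ids).logits.log_softmax(dim=-1)
    total = 0.0
    for pos in range(ctx_len, full_ids.shape[1]):
        total += logprobs[0, pos - 1, full_ids[0, pos]].item()
    return total

# Compare a scalar adjective with and without negation in the context.
with_negation = continuation_logprob("The soup was not hot, it was", " warm")
without_negation = continuation_logprob("The soup was hot, it was", " warm")
```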

Generative FrameNet: Scalable and Adaptive Frames for Interpretable Knowledge Storage and Retrieval for LLMs Powered by LLMs
Harish Tayyar Madabushi | Taylor Hudson | Claire Bonial

Frame semantics explains how we use conceptual frames, which encapsulate background knowledge and associations, to understand the meanings of words within a context more completely. Unfortunately, FrameNet, the only widely available implementation of frame semantics, is limited in both scale and coverage. Therefore, we introduce a novel mechanism for generating task-specific frames using large language models (LLMs), which we call Generative FrameNet. We demonstrate its effectiveness on a task that is highly relevant in the current landscape of LLMs: the interpretable storage and retrieval of factual information. Specifically, Generative Frames enable the extension of Retrieval-Augmented Generation (RAG), providing an interpretable framework for reducing inaccuracies in LLMs. We conduct experiments that demonstrate the effectiveness of this method both in terms of retrieval quality and the relevance of the automatically generated frames and frame relations. Expert analysis shows that Generative Frames capture a more suitable level of semantic specificity than the frames from FrameNet. Thus, Generative Frames capture a notion of frame semantics that is closer to Fillmore’s originally intended definition and offer potential for providing data-driven insights into Frame Semantics theory. Our results also show that this novel mechanism of Frame Semantics-based interpretable retrieval improves RAG for question answering with LLMs, outperforming a GPT-4-based baseline by up to 8 points. We provide open access to our data, including prompts and Generative FrameNet.
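
A hedged sketch of how LLM-generated frames could support interpretable retrieval: a frame (name, definition, frame elements) is generated as JSON for a query and matched against document frames by frame name and role overlap. The JSON schema, scoring weights, and `llm` callable are assumptions, not the paper's implementation.

```python
import json
from typing import Callable

def generate_frame(text: str, llm: Callable[[str], str]) -> dict:
    """Ask an LLM to propose a task-specific frame for a piece of text."""
    prompt = ("Propose a semantic frame for the text below as JSON with keys "
              "'frame', 'definition', and 'frame_elements' (a role-to-filler "
              "mapping).\n" + text)
    return json.loads(llm(prompt))

def frame_match(query_frame: dict, doc_frame: dict) -> float:
    # Interpretable retrieval signal: shared frame name plus role overlap.
    same_frame = float(query_frame["frame"] == doc_frame["frame"])
    q_roles = set(query_frame["frame_elements"])
    d_roles = set(doc_frame["frame_elements"])
    overlap = len(q_roles & d_roles) / max(len(q_roles | d_roles), 1)
    return 0.5 * same_frame + 0.5 * overlap
```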