Guy Aglionby


2022

Identifying relevant common sense information in knowledge graphs
Guy Aglionby | Simone Teufel
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)

Knowledge graphs are often used to store common sense information that is useful for various tasks. However, the extraction of contextually relevant knowledge is an unsolved problem, and current approaches are relatively simple. Here we introduce a triple selection method based on a ranking model and find that it improves question answering accuracy over existing methods. We additionally investigate methods to ensure that extracted triples form a connected graph. Graph connectivity is important for model interpretability, as paths are frequently used as explanations for the reasoning that connects question and answer.
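
The two ideas in this abstract, selecting triples by rank and checking that the selection forms a connected graph, can be illustrated with a minimal sketch. This is not the paper's method: the relevance scores, the function names `select_top_k` and `is_connected`, and the example triples are all hypothetical, and the scores stand in for the output of whatever ranking model is used.

```python
from collections import defaultdict


def select_top_k(scored_triples, k):
    """Return the k triples with the highest (hypothetical) relevance score."""
    ranked = sorted(scored_triples, key=lambda pair: pair[1], reverse=True)
    return [triple for triple, _ in ranked[:k]]


def is_connected(triples):
    """Check whether the triples, read as undirected edges, form one component."""
    adjacency = defaultdict(set)
    for head, _relation, tail in triples:
        adjacency[head].add(tail)
        adjacency[tail].add(head)
    if not adjacency:
        return True  # an empty selection is treated as trivially connected
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:  # iterative depth-first search from an arbitrary node
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adjacency)


# Illustrative (head, relation, tail) triples with made-up scores.
scored = [
    (("bird", "CapableOf", "fly"), 0.9),
    (("fly", "RelatedTo", "wing"), 0.8),
    (("car", "HasA", "wheel"), 0.7),
    (("bird", "IsA", "animal"), 0.2),
]

top_two = select_top_k(scored, 2)
top_three = select_top_k(scored, 3)
print(is_connected(top_two))    # bird-fly-wing is a single path
print(is_connected(top_three))  # car-wheel is detached from the rest
```

In this toy example a purely score-based cut-off at three triples yields a disconnected selection, which is the kind of situation that motivates connectivity-aware extraction.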

Faithful Knowledge Graph Explanations in Commonsense Question Answering
Guy Aglionby | Simone Teufel
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Knowledge graphs are commonly used as sources of information in commonsense question answering, and can also be used to express explanations for the model’s answer choice. A common way of incorporating facts from the graph is to encode them separately from the question, and then combine the two representations to select an answer. In this paper, we argue that highly faithful graph-based explanations cannot be extracted from existing models of this type. Such explanations will not include reasoning done by the transformer encoding the question, so will be incomplete. We confirm this theory with a novel proxy measure for faithfulness and propose two architecture changes to address the problem. Our findings suggest a path forward for developing architectures for faithful graph-based explanations.

2019

CAMsterdam at SemEval-2019 Task 6: Neural and graph-based feature extraction for the identification of offensive tweets
Guy Aglionby | Chris Davis | Pushkar Mishra | Andrew Caines | Helen Yannakoudakis | Marek Rei | Ekaterina Shutova | Paula Buttery
Proceedings of the 13th International Workshop on Semantic Evaluation

We describe the CAMsterdam team entry to the SemEval-2019 Shared Task 6 on offensive language identification in Twitter data. Our proposed model learns to extract textual features using a multi-layer recurrent network, and then performs text classification using gradient-boosted decision trees (GBDT). A self-attention architecture enables the model to focus on the most relevant areas in the text. In order to enrich input representations, we use node2vec to learn globally optimised embeddings for hashtags, which are then given as additional features to the GBDT classifier. Our best model obtains 78.79% macro F1-score on detecting offensive language (subtask A), 66.32% on categorising offence types (targeted/untargeted; subtask B), and 55.36% on identifying the target of offence (subtask C).
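
The scores above are macro-averaged F1, which averages per-class F1 with equal weight per class regardless of class frequency. The following is a minimal sketch of that computation, not the shared task's official scorer: the function names and the toy labels are illustrative assumptions, and the set of classes is taken to be every label that appears in either the gold or the predicted list.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), defined as 0.0 when the denominator is 0."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Toy data with illustrative labels: one offensive tweet, three non-offensive.
gold = ["OFF", "NOT", "NOT", "NOT"]
predicted = ["OFF", "OFF", "NOT", "NOT"]

# Per class: OFF has TP=1, FP=1, FN=0 (F1 = 2/3); NOT has TP=2, FP=0, FN=1 (F1 = 4/5).
print(macro_f1(gold, predicted))
```

Because each class contributes equally, a model that does well on the majority class but poorly on a rare class is penalised more under macro F1 than under accuracy.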