Zied Bouraoui


2024

Modelling Commonsense Commonalities with Multi-Facet Concept Embeddings
Hanane Kteich | Na Li | Usashi Chatterjee | Zied Bouraoui | Steven Schockaert
Findings of the Association for Computational Linguistics: ACL 2024

Concept embeddings offer a practical and efficient mechanism for injecting commonsense knowledge into downstream tasks. Their core purpose is often not to predict the commonsense properties of concepts themselves, but rather to identify commonalities, i.e. sets of concepts which share some property of interest. Such commonalities are the basis for inductive generalisation, hence high-quality concept embeddings can make learning easier and more robust. Unfortunately, standard embeddings primarily reflect basic taxonomic categories, making them unsuitable for finding commonalities that refer to more specific aspects (e.g. the colour of objects or the materials they are made of). In this paper, we address this limitation by explicitly modelling the different facets of interest when learning concept embeddings. We show that this leads to embeddings which capture a more diverse range of commonsense properties, and consistently improves results in downstream tasks such as ultra-fine entity typing and ontology completion.

CONTOR: Benchmarking Strategies for Completing Ontologies with Plausible Missing Rules
Na Li | Thomas Bailleux | Zied Bouraoui | Steven Schockaert
Findings of the Association for Computational Linguistics: EMNLP 2024

We consider the problem of finding plausible rules that are missing from a given ontology. A number of strategies for this problem have already been considered in the literature. Little is known about the relative performance of these strategies, however, as they have thus far been evaluated on different ontologies. Moreover, existing evaluations have focused on distinguishing held-out ontology rules from randomly corrupted ones, which often makes the task unrealistically easy and leads to the presence of incorrectly labelled negative examples. To address these concerns, we introduce a benchmark with manually annotated hard negatives and use this benchmark to evaluate ontology completion models. In addition to previously proposed models, we test the effectiveness of several approaches that have not yet been considered for this task, including LLMs and simple but effective hybrid strategies.

AMenDeD: Modelling Concepts by Aligning Mentions, Definitions and Decontextualised Embeddings
Amit Gajbhiye | Zied Bouraoui | Luis Espinosa Anke | Steven Schockaert
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Contextualised Language Models (LM) improve on traditional word embeddings by encoding the meaning of words in context. However, such models have also made it possible to learn high-quality decontextualised concept embeddings. Three main strategies for learning such embeddings have thus far been considered: (i) fine-tuning the LM to directly predict concept embeddings from the name of the concept itself, (ii) averaging contextualised representations of mentions of the concept in a corpus, and (iii) encoding definitions of the concept. As these strategies have complementary strengths and weaknesses, we propose to learn a unified embedding space in which all three types of representations can be integrated. We show that this allows us to outperform existing approaches in tasks such as ontology completion, which heavily depends on access to high-quality concept embeddings. We furthermore find that mentions and definitions are well-aligned in the resulting space, enabling tasks such as target sense verification, even without the need for any fine-tuning.

Can Language Models Learn Embeddings of Propositional Logic Assertions?
Nurul Fajrin Ariyani | Zied Bouraoui | Richard Booth | Steven Schockaert
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Natural language offers an appealing alternative to formal logics as a vehicle for representing knowledge. However, using natural language means that standard methods for automated reasoning can no longer be used. A popular solution is to use transformer-based language models (LMs) to directly reason about knowledge expressed in natural language, but this has two important limitations. First, the set of premises is often too large to be directly processed by the LM. This means that we need a retrieval strategy which can select the most relevant premises when trying to infer some conclusion. Second, LMs have been found to learn shortcuts and thus lack robustness, putting in doubt to what extent they actually understand the knowledge that is expressed. Given these limitations, we explore the following alternative: rather than using LMs to perform reasoning directly, we use them to learn embeddings of individual assertions. Reasoning is then carried out by manipulating the learned embeddings. We show that this strategy is feasible to some extent, while at the same time also highlighting the limitations of directly fine-tuning LMs to learn the required embeddings.

2023

Ultra-Fine Entity Typing with Prior Knowledge about Labels: A Simple Clustering Based Strategy
Na Li | Zied Bouraoui | Steven Schockaert
Findings of the Association for Computational Linguistics: EMNLP 2023

Ultra-fine entity typing (UFET) is the task of inferring the semantic types from a large set of fine-grained candidates that apply to a given entity mention. This task is especially challenging because we only have a small number of training examples for many types, even with distant supervision strategies. State-of-the-art models, therefore, have to rely on prior knowledge about the type labels in some way. In this paper, we show that the performance of existing methods can be improved using a simple technique: we use pre-trained label embeddings to cluster the labels into semantic domains and then treat these domains as additional types. We show that this strategy consistently leads to improved results as long as high-quality label embeddings are used. Furthermore, we use the label clusters as part of a simple post-processing technique, which results in further performance gains. Both strategies treat the UFET model as a black box and can thus straightforwardly be used to improve a wide range of existing models.
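
The label-clustering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm is used, so a tiny deterministic k-means is swapped in here as an assumption, and the two-dimensional label embeddings, the `domain_<id>` naming, and the `augment` helper are all hypothetical.

```python
import math

def kmeans(vectors, k, iterations=10):
    """Tiny deterministic k-means: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignment

# Hypothetical 2-d label embeddings; real ones would come from a pre-trained model.
label_embeddings = {
    "musician": [0.9, 0.1],
    "river": [0.1, 0.9],
    "singer": [0.8, 0.2],
    "lake": [0.2, 0.8],
}
labels = list(label_embeddings)
clusters = kmeans([label_embeddings[l] for l in labels], k=2)
label_to_domain = {l: f"domain_{c}" for l, c in zip(labels, clusters)}

def augment(gold_types):
    """Add the domain of every gold type as an extra training label."""
    return sorted(set(gold_types) | {label_to_domain[t] for t in gold_types})

print(augment(["singer"]))
```

Because the augmentation only rewrites the label sets, the underlying typing model is left untouched, which matches the black-box framing in the abstract.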

What do Deck Chairs and Sun Hats Have in Common? Uncovering Shared Properties in Large Concept Vocabularies
Amit Gajbhiye | Zied Bouraoui | Na Li | Usashi Chatterjee | Luis Espinosa-Anke | Steven Schockaert
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Concepts play a central role in many applications. This includes settings where concepts have to be modelled in the absence of sentence context. Previous work has therefore focused on distilling decontextualised concept embeddings from language models. But concepts can be modelled from different perspectives, whereas concept embeddings typically mostly capture taxonomic structure. To address this issue, we propose a strategy for identifying what different concepts, from a potentially large concept vocabulary, have in common with others. We then represent concepts in terms of the properties they share with the other concepts. To demonstrate the practical usefulness of this way of modelling concepts, we consider the task of ultra-fine entity typing, which is a challenging multi-label classification problem. We show that by augmenting the label set with shared properties, we can improve the performance of the state-of-the-art models for this task.

2022

Sentence Selection Strategies for Distilling Word Embeddings from BERT
Yixiao Wang | Zied Bouraoui | Luis Espinosa Anke | Steven Schockaert
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Many applications crucially rely on the availability of high-quality word vectors. To learn such representations, several strategies based on language models have been proposed in recent years. While effective, these methods typically rely on a large number of contextualised vectors for each word, which makes them impractical. In this paper, we investigate whether similar results can be obtained when only a few contextualised representations of each word can be used. To this end, we analyse a range of strategies for selecting the most informative sentences. Our results show that with a careful selection strategy, high-quality word vectors can be learned from as few as 5 to 10 sentences.

2021

Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection
Yixiao Wang | Zied Bouraoui | Luis Espinosa Anke | Steven Schockaert
Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)

One of the long-standing challenges in lexical semantics consists in learning representations of words which reflect their semantic properties. The remarkable success of word embeddings for this purpose suggests that high-quality representations can be obtained by summarizing the sentence contexts of word mentions. In this paper, we propose a method for learning word representations that follows this basic strategy, but differs from standard word embeddings in two important ways. First, we take advantage of contextualized language models (CLMs) rather than bags of word vectors to encode contexts. Second, rather than learning a word vector directly, we use a topic model to partition the contexts in which words appear, and then learn different topic-specific vectors for each word. Finally, we use a task-specific supervision signal to make a soft selection of the resulting vectors. We show that this simple strategy leads to high-quality word vectors, which are more predictive of semantic properties than word embeddings and existing CLM-based strategies.

2020

A Mixture-of-Experts Model for Learning Multi-Facet Entity Embeddings
Rana Alshaikh | Zied Bouraoui | Shelan Jeawak | Steven Schockaert
Proceedings of the 28th International Conference on Computational Linguistics

Various methods have already been proposed for learning entity embeddings from text descriptions. Such embeddings are commonly used for inferring properties of entities, for recommendation and entity-oriented search, and for injecting background knowledge into neural architectures, among others. Entity embeddings essentially serve as a compact encoding of a similarity relation, but similarity is an inherently multi-faceted notion. By representing entities as single vectors, existing methods leave it to downstream applications to identify these different facets, and to select the most relevant ones. In this paper, we propose a model that instead learns several vectors for each entity, each of which intuitively captures a different aspect of the considered domain. We use a mixture-of-experts formulation to jointly learn these facet-specific embeddings. The individual entity embeddings are learned using a variant of the GloVe model, which has the advantage that we can easily identify which properties are modelled well in which of the learned embeddings. This is exploited by an associated gating network, which uses pre-trained word vectors to encourage the properties that are modelled by a given embedding to be semantically coherent, i.e. to encourage each of the individual embeddings to capture a meaningful facet.

2019

Learning Conceptual Spaces with Disentangled Facets
Rana Alshaikh | Zied Bouraoui | Steven Schockaert
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Conceptual spaces are geometric representations of meaning that were proposed by Gärdenfors (2000). They share many similarities with the vector space embeddings that are commonly used in natural language processing. However, rather than representing entities in a single vector space, conceptual spaces are usually decomposed into several facets, each of which is then modelled as a relatively low dimensional vector space. Unfortunately, the problem of learning such conceptual spaces has thus far only received limited attention. To address this gap, we analyze how, and to what extent, a given vector space embedding can be decomposed into meaningful facets in an unsupervised fashion. While this problem is highly challenging, we show that useful facets can be discovered by relying on word embeddings to group semantically related features.

2018

Unsupervised Learning of Distributional Relation Vectors
Shoaib Jameel | Zied Bouraoui | Steven Schockaert
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Word embedding models such as GloVe rely on co-occurrence statistics to learn vector representations of word meaning. While we may similarly expect that co-occurrence statistics can be used to capture rich information about the relationships between different words, existing approaches for modeling such relationships are based on manipulating pre-trained word vectors. In this paper, we introduce a novel method which directly learns relation vectors from co-occurrence statistics. To this end, we first introduce a variant of GloVe, in which there is an explicit connection between word vectors and PMI weighted co-occurrence vectors. We then show how relation vectors can be naturally embedded into the resulting vector space.

Relation Induction in Word Embeddings Revisited
Zied Bouraoui | Shoaib Jameel | Steven Schockaert
Proceedings of the 27th International Conference on Computational Linguistics

Given a set of instances of some relation, the relation induction task is to predict which other word pairs are likely to be related in the same way. While it is natural to use word embeddings for this task, standard approaches based on vector translations turn out to perform poorly. To address this issue, we propose two probabilistic relation induction models. The first model is based on translations, but uses Gaussians to explicitly model the variability of these translations and to encode soft constraints on the source and target words that may be chosen. In the second model, we use Bayesian linear regression to encode the assumption that there is a linear relationship between the vector representations of related words, which is considerably weaker than the assumption underlying translation based models.
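
To make the first of the two models more concrete, the sketch below fits a Gaussian to the translation offsets of known instances and scores candidate pairs by log-density. It is only a rough illustration under stated assumptions: the diagonal covariance, the variance floor `min_var`, the toy two-dimensional embeddings, and the function names are all hypothetical, and the soft constraints on source and target words mentioned in the abstract are not modelled.

```python
import math

def fit_diagonal_gaussian(pairs, embeddings, min_var=1e-3):
    """Fit a per-dimension Gaussian to the offsets target - source."""
    offsets = [
        [t - s for s, t in zip(embeddings[src], embeddings[tgt])]
        for src, tgt in pairs
    ]
    n = len(offsets)
    mean = [sum(col) / n for col in zip(*offsets)]
    var = [
        max(sum((x - m) ** 2 for x in col) / n, min_var)  # floor avoids zero variance
        for col, m in zip(zip(*offsets), mean)
    ]
    return mean, var

def log_density(src, tgt, embeddings, mean, var):
    """Log-density of a candidate pair's offset under the fitted Gaussian."""
    offset = [t - s for s, t in zip(embeddings[src], embeddings[tgt])]
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(offset, mean, var)
    )

# Hypothetical toy embeddings for a country -> capital relation.
embeddings = {
    "france": [1.0, 0.0], "paris": [1.0, 1.0],
    "italy": [2.0, 0.1], "rome": [2.1, 1.0],
    "spain": [3.0, 0.0], "madrid": [3.0, 1.1],
    "banana": [0.0, -2.0],
}
mean, var = fit_diagonal_gaussian([("france", "paris"), ("italy", "rome")], embeddings)
plausible = log_density("spain", "madrid", embeddings, mean, var)
implausible = log_density("spain", "banana", embeddings, mean, var)
print(plausible > implausible)
```

Modelling the variance of the offsets, rather than using a single mean translation vector, lets the score tolerate variation along dimensions where known instances disagree while penalising deviations along dimensions where they agree.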