Sahar Vahdati


2024

Beyond Boundaries: A Human-like Approach for Question Answering over Structured and Unstructured Information Sources
Jens Lehmann | Dhananjay Bhandiwad | Preetam Gattogi | Sahar Vahdati
Transactions of the Association for Computational Linguistics, Volume 12

Answering factual questions from heterogeneous sources, such as graphs and text, is a key capacity of intelligent systems. Current approaches either (i) perform question answering over text and structured sources as separate pipelines followed by a merge step or (ii) provide an early integration, giving up the strengths of particular information sources. To solve this problem, we present “HumanIQ”, a method that teaches language models to dynamically combine retrieved information by imitating how humans use retrieval tools. Our approach couples a generic method for gathering human demonstrations of tool use with adaptive few-shot learning for tool-augmented models. We show that HumanIQ confers significant benefits, including (i) reducing the error rate of our strongest baseline (GPT-4) by over 50% across 3 benchmarks, (ii) improving human preference over responses from vanilla GPT-4 (45.3% wins, 46.7% ties, 8.0% losses), and (iii) outperforming numerous task-specific baselines.

Knowledge GeoGebra: Leveraging Geometry of Relation Embeddings in Knowledge Graph Completion
Kossi Amouzouvi | Bowen Song | Sahar Vahdati | Jens Lehmann
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Knowledge graph embedding (KGE) models provide a low-dimensional representation of knowledge graphs in continuous vector spaces. This representation learning enables different downstream AI tasks such as link prediction for graph completion. However, most embedding models are designed considering only the algebra and geometry of the entity embedding space, the algebra of the relation embedding space, and the interaction between relation and entity embeddings. Neglecting the geometry of the relation embedding space limits the optimization of entity and relation distributions, leading to suboptimal performance in knowledge graph completion. To address this issue, we propose a new perspective in the design of KGEs by looking into the geometry of the relation embedding space. The proposed method and its variants are developed on top of an existing framework, RotatE, from which we leverage the geometry of the relation embeddings by first mutating the unit circle to an ellipse and then generalizing it further with the concept of a butterfly curve. Besides the theoretical abilities of the model in preserving topological and relational patterns, the experiments on the WN18RR, FB15K-237 and YouTube benchmarks showed that this new family of KGEs can challenge or outperform state-of-the-art models.
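
As a rough illustration of the idea of moving relation embeddings from the unit circle to an ellipse, the sketch below scores a triple in the RotatE style (distance between the head multiplied element-wise by the relation and the tail). The semi-axes `a` and `b`, the function names, and the use of a summed complex modulus as the distance are assumptions made for this example; the abstract does not specify the paper's exact parameterization, and the butterfly-curve variant is not covered here.

```python
import cmath
import math


def relation_on_ellipse(theta, a=1.0, b=1.0):
    """Point on an ellipse with semi-axes a and b (hypothetical parameters).

    With a == b == 1 this is exp(i * theta), i.e. RotatE's unit circle.
    """
    return complex(a * math.cos(theta), b * math.sin(theta))


def score(head, phases, tail, a=1.0, b=1.0):
    """Negative sum of |h * r - t| over dimensions; higher means more plausible."""
    dist = 0.0
    for h, theta, t in zip(head, phases, tail):
        dist += abs(h * relation_on_ellipse(theta, a, b) - t)
    return -dist


# Toy two-dimensional complex embeddings.
head = [complex(1, 0), complex(0, 1)]
phases = [math.pi / 2, math.pi]

# Tail constructed so that it matches the head under a pure rotation.
tail_circle = [h * cmath.exp(1j * p) for h, p in zip(head, phases)]

circle_score = score(head, phases, tail_circle)                  # close to 0
ellipse_score = score(head, phases, tail_circle, a=2.0, b=0.5)   # lower: ellipse rescales
```

On the unit circle the relation only rotates the head, so the constructed tail is matched almost exactly. With unequal semi-axes the relation also rescales the head depending on the phase, which is the extra geometric freedom the example is meant to show.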

2021

Knowledge Graph Representation Learning using Ordinary Differential Equations
Mojtaba Nayyeri | Chengjin Xu | Franca Hoffmann | Mirza Mohtashim Alam | Jens Lehmann | Sahar Vahdati
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Knowledge Graph Embeddings (KGEs) have shown promising performance on link prediction tasks by mapping the entities and relations from a knowledge graph into a geometric space. The capability of KGEs to preserve graph characteristics, including structural aspects and semantics, depends heavily on the design of their score function, as well as on the abilities inherited from the underlying geometry. Many KGEs use Euclidean geometry, which renders them incapable of preserving complex structures and consequently causes wrong inferences by the models. To address this problem, we propose a neuro-differential KGE that embeds nodes of a KG on the trajectories of Ordinary Differential Equations (ODEs). To this end, we represent each relation (edge) in a KG as a vector field on several manifolds. We specifically parameterize ODEs by a neural network to represent complex manifolds and complex vector fields on the manifolds. Therefore, the underlying embedding space is capable of assuming the shape of various geometric forms to encode heterogeneous subgraphs. Experiments on synthetic and benchmark datasets using state-of-the-art KGE models justify the ODE trajectories as a means to enable structure preservation and consequently avoid wrong inferences.
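
The idea of treating a relation as a vector field and placing entities on ODE trajectories can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the tiny fixed-weight network, the explicit Euler integrator, the unit integration time, and the distance-based score are all hypothetical choices made for the example.

```python
import math


def vector_field(x, weights):
    """One-hidden-layer network f(x) = W2 * tanh(W1 * x).

    Stands in for a learned, relation-specific vector field (assumed form).
    """
    w1, w2 = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def flow(x0, weights, steps=100, t_end=1.0):
    """Explicit Euler integration of dx/dt = f(x) from t = 0 to t = t_end."""
    dt = t_end / steps
    x = list(x0)
    for _ in range(steps):
        dx = vector_field(x, weights)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


def score(head, tail, weights):
    """Negative Euclidean distance between the transported head and the tail."""
    return -math.dist(flow(head, weights), tail)


# Hand-picked weights for one hypothetical relation (not learned).
weights = ([[0.0, -1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
head = [1.0, 0.0]

# The entity reached by following the trajectory is the best-scoring tail.
predicted_tail = flow(head, weights)
good = score(head, predicted_tail, weights)
bad = score(head, [5.0, 5.0], weights)
```

A tail that lies on the trajectory starting at the head scores highest, while an entity far from that trajectory scores lower. In a trained model the network weights would be learned per relation instead of being fixed by hand.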