Jakub Zavrel

2022

Multi-objective Representation Learning for Scientific Document Retrieval
Mathias Parisot | Jakub Zavrel
Proceedings of the Third Workshop on Scholarly Document Processing

Existing dense retrieval models for scientific documents have been optimized either for retrieval by short queries or for document similarity, but usually not for both. In this paper, we explore the space of combining multiple objectives to achieve a single representation model that strikes a good balance between both modes of dense retrieval, combining the relevance judgements from MS MARCO with the citation similarity of SPECTER and the self-supervised objective of independent cropping. We also consider the addition of training data from document co-citation in a sentence context and domain-specific synthetic data. We show that combining multiple objectives yields models that generalize well across different benchmark tasks, improving by up to 73% over models trained on a single objective.
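
One common way to combine several retrieval objectives is a weighted sum of per-objective contrastive losses. The sketch below illustrates that idea only; the abstract does not say how the objectives are mixed, so the InfoNCE loss, the temperature, the weights, and the objective names ("relevance", "citation") are all assumptions made for illustration, not the paper's method.

```python
import math


def info_nce(query, positive, negatives, temperature=0.05):
    """Contrastive (InfoNCE) loss for one query, using dot-product similarity."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # The positive goes first, so its logit sits at index 0.
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, neg) / temperature for neg in negatives]

    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


def combined_loss(batches, weights):
    """Weighted sum of the mean loss of each objective.

    batches: {objective_name: [(query, positive, negatives), ...]}
    weights: {objective_name: float}  (hypothetical mixing weights)
    """
    total = 0.0
    for name, examples in batches.items():
        losses = [info_nce(q, p, negs) for q, p, negs in examples]
        total += weights[name] * sum(losses) / len(losses)
    return total


# Toy 2-d embeddings: one well-separated example, one badly ranked example.
batches = {
    "relevance": [([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])],
    "citation": [([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])],
}
weights = {"relevance": 0.5, "citation": 2.0}
loss = combined_loss(batches, weights)
```

In this toy setup the badly ranked "citation" example dominates the total, which is the behaviour a weighted sum is meant to give: each objective contributes in proportion to its weight and its current error.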

2020

A New Neural Search and Insights Platform for Navigating and Organizing AI Research
Marzieh Fadaee | Olga Gureenkova | Fernando Rejon Barrera | Carsten Schnober | Wouter Weerkamp | Jakub Zavrel
Proceedings of the First Workshop on Scholarly Document Processing

To provide AI researchers with modern tools for dealing with the explosive growth of the research literature in their field, we introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature. The system provides search at multiple levels of textual granularity, from sentences to aggregations across documents, both in natural language and through navigation in a domain-specific Knowledge Graph. We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.

Effective distributed representations for academic expert search
Mark Berger | Jakub Zavrel | Paul Groth
Proceedings of the First Workshop on Scholarly Document Processing

Expert search aims to find and rank experts based on a user’s query. In academia, retrieving experts is an efficient way to navigate through a large amount of academic knowledge. Here, we study how different distributed representations of academic papers (i.e., embeddings) impact academic expert retrieval. We use the Microsoft Academic Graph dataset and experiment with different configurations of a document-centric voting model for retrieval. In particular, we explore the impact of the use of contextualized embeddings on search performance. We also present results for paper embeddings that incorporate citation information through retrofitting. Additionally, experiments are conducted using different techniques for assigning author weights based on author order. We observe that contextual embeddings produced by a transformer model trained for sentence similarity tasks yield the most effective paper representations for document-centric expert retrieval. However, retrofitting the paper embeddings and using elaborate author contribution weighting strategies did not improve retrieval performance.
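
A document-centric voting model can be sketched as follows: rank papers by embedding similarity to the query, then let each retrieved paper vote for its authors, optionally weighted by author position. This is a minimal illustration under stated assumptions; cosine similarity, the `top_k` cutoff, the score-weighted vote, and the `position_weight` hook are illustrative choices, not the configuration used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_experts(query_vec, papers, top_k=10, position_weight=None):
    """Rank authors by votes from the top_k papers most similar to the query.

    papers: list of (embedding, [author names in byline order])
    position_weight: hypothetical hook, f(author_index, n_authors) -> float;
                     defaults to uniform weighting.
    """
    if position_weight is None:
        position_weight = lambda index, n_authors: 1.0

    # Step 1: document retrieval by embedding similarity.
    scored = sorted(
        ((cosine(query_vec, emb), authors) for emb, authors in papers),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]

    # Step 2: each retrieved paper votes for its authors.
    votes = {}
    for score, authors in scored:
        for index, author in enumerate(authors):
            weight = position_weight(index, len(authors))
            votes[author] = votes.get(author, 0.0) + score * weight

    return sorted(votes.items(), key=lambda item: item[1], reverse=True)


# Toy example with 2-d embeddings and placeholder author names.
papers = [
    ([1.0, 0.0], ["A", "B"]),
    ([0.0, 1.0], ["C"]),
    ([1.0, 0.1], ["A"]),
]
ranking = rank_experts([1.0, 0.0], papers)
```

Passing a different `position_weight`, for example one that favours first authors, changes only the voting step and leaves the document retrieval step untouched.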

2007

Learning to Compose Effective Strategies from a Library of Dialogue Components
Martijn Spitters | Marco De Boni | Jakub Zavrel | Remko Bonnema
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2001

Improving Accuracy in Word Class Tagging through the Combination of Machine Learning Systems
Hans Van Halteren | Jakub Zavrel | Walter Daelemans
Computational Linguistics, Volume 27, Number 2, June 2001
