Exploring the Value of Multi-View Learning for Session-Aware Query Representation
Diego Ortiz | Jose Moreno | Gilles Hubert | Karen Pinel-Sauvagnat | Lynda Tamine
Findings of the Association for Computational Linguistics: NAACL 2022
Recent years have witnessed a growing interest towards learning distributed query representations that are able to capture search intent semantics. Most existing approaches learn query embeddings using relevance supervision making them suited only to document ranking tasks. Besides, they generally consider either user’s query reformulations or system’s rankings whereas previous findings show that user’s query behavior and knowledge change depending on the system’s results, intertwine and affect each other during the completion of a search task. In this paper, we explore the value of multi-view learning for generic and unsupervised session-aware query representation learning. First, single-view query embeddings are obtained in separate spaces from query reformulations and document ranking representations using transformers. Then, we investigate the use of linear (CCA) and non linear (UMAP) multi-view learning methods, to align those spaces with the aim of revealing similarity traits in the multi-view shared space. Experimental evaluation is carried out in a query classification and session-based retrieval downstream tasks using respectively the KDD and TREC session datasets. The results show that multi-view learning is an effective and controllable approach for unsupervised learning of generic query representations and can reflect search behavior patterns.
Rouletabille at SemEval-2019 Task 4: Neural Network Baseline for Identification of Hyperpartisan Publishers
Jose G. Moreno | Yoann Pitarch | Karen Pinel-Sauvagnat | Gilles Hubert
Proceedings of the 13th International Workshop on Semantic Evaluation
This paper describes the Rouletabille participation to the Hyperpartisan News Detection task. We propose the use of different text classification methods for this task. Preliminary experiments using a similar collection used in (Potthast et al., 2018) show that neural-based classification methods reach state-of-the art results. Our final submission is composed of a unique run that ranks among all runs at 3/49 position for the by-publisher test dataset and 43/96 for the by-article test dataset in terms of Accuracy.