2024
pdf
bib
Evaluating D-MERIT of Partial-annotation on Information Retrieval
Royi Rassin
|
Yaron Fairstein
|
Oren Kalinsky
|
Guy Kushilevitz
|
Nachshon Cohen
|
Alexander Libov
|
Yoav Goldberg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
pdf
bib
abs
Class Balancing for Efficient Active Learning in Imbalanced Datasets
Yaron Fairstein
|
Oren Kalinsky
|
Zohar Karnin
|
Guy Kushilevitz
|
Alexander Libov
|
Sofia Tolmach
Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)
Recent developments in active learning algorithms for NLP tasks show promising results in terms of reducing labelling complexity. In this paper we extend this effort to imbalanced datasets; we bridge between the active learning approach of obtaining diverse andinformative examples, and the heuristic of class balancing used in imbalanced datasets. We develop a novel tune-free weighting technique that canbe applied to various existing active learning algorithms, adding a component of class balancing. We compare several active learning algorithms to their modified version on multiple public datasetsand show that when the classes are imbalanced, with manual annotation effort remaining equal the modified version significantly outperforms the original both in terms of the test metric and the number of obtained minority examples. Moreover, when the imbalance is mild or non-existent (classes are completely balanced), our technique does not harm the base algorithms.
pdf
bib
abs
Towards Translating Objective Product Attributes Into Customer Language
Ram Yazdi
|
Oren Kalinsky
|
Alexander Libov
|
Dafna Shahaf
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
When customers search online for a product they are not familiar with, their needs are often expressed through subjective product attributes, such as ”picture quality” for a TV or ”easy to clean” for a sofa. In contrast, the product catalog in online stores includes objective attributes such as ”screen resolution” or ”material”. In this work, we aim to find a link between the objective product catalog and the subjective needs of the customers, to help customers better understand the product space using their own words. We apply correlation-based methods to the store’s product catalog and product reviews in order to find the best potential links between objective and subjective attributes; next, Large Language Models (LLMs) reduce spurious correlations by incorporating common sense and world knowledge (e.g., picture quality is indeed affected by screen resolution, and 8k is the best one). We curate a dataset for this task and show that our combined approach outperforms correlation-only and causation-only approaches.
2023
pdf
bib
abs
Simple and Effective Multi-Token Completion from Masked Language Models
Oren Kalinsky
|
Guy Kushilevitz
|
Alexander Libov
|
Yoav Goldberg
Findings of the Association for Computational Linguistics: EACL 2023
Pre-trained neural masked language models are often used for predicting a replacement token for a given sequence position, in a cloze-like task. However, this usage is restricted to predicting a single token, from a relatively small pre-trained vocabulary. Recent Sequence2Sequence pre-trained LMs like T5 do allow predicting multi-token completions, but are more expensive to train and run. We show that pre-trained masked language models can be adapted to produce multi-token completions, with only a modest addition to their parameter count. We propose two simple adaptation approaches, trading parameter counts for accuracy. The first method generates multi-token completions from a conditioned RNN. It has a very low parameter count and achieves competitive results. The second method is even simpler: it adds items corresponding to multi-token units to the output prediction matrix. While being higher in parameter count than the RNN method, it also surpasses current state-of-the-art multi-token completion models, including T5-3B, while being significantly more parameter efficient. We demonstrate that our approach is flexible to different vocabularies and domains and can effectively leverage existing pre-trained models available in different domains. Finally, a human evaluation further validates our results and shows that our solution regularly provides valid completions, as well as reasonable correctness for factual-sentence completions.
2021
pdf
bib
abs
WikiSum: Coherent Summarization Dataset for Efficient Human-Evaluation
Nachshon Cohen
|
Oren Kalinsky
|
Yftah Ziser
|
Alessandro Moschitti
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Recent works made significant advances on summarization tasks, facilitated by summarization datasets. Several existing datasets have the form of coherent-paragraph summaries. However, these datasets were curated from academic documents that were written for experts, thus making the essential step of assessing the summarization output through human-evaluation very demanding. To overcome these limitations, we present a dataset based on article summaries appearing on the WikiHow website, composed of how-to articles and coherent-paragraph summaries written in plain language. We compare our dataset attributes to existing ones, including readability and world-knowledge, showing our dataset makes human evaluation significantly easier and thus, more effective. A human evaluation conducted on PubMed and the proposed dataset reinforces our findings.