Jon Atle Gulla


2024

NLEBench+NorGLM: A Comprehensive Empirical Analysis and Benchmark Dataset for Generative Language Models in Norwegian
Peng Liu | Lemei Zhang | Terje Farup | Even W. Lauvrak | Jon Espen Ingvaldsen | Simen Eide | Jon Atle Gulla | Zhirong Yang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Norwegian, spoken by only about 5 million people, is under-represented in the most impressive breakthroughs in NLP. To the best of our knowledge, at the time of writing there has not yet been a comprehensive evaluation of existing language models (LMs) on Norwegian generation tasks. To fill this gap, we 1) compiled the existing Norwegian datasets and pre-trained 4 Norwegian Open Language Models that vary in parameter scale and architecture, collectively called NorGLM; 2) introduced a comprehensive benchmark, NLEBench, for evaluating natural language generation capabilities in Norwegian, encompassing translation and human annotation. Based on the investigation, we find that: 1) the mainstream, English-dominated LM GPT-3.5 has limited capability in understanding the Norwegian context; 2) increasing the model parameter scale has limited impact on downstream task performance when the pre-training dataset is constrained in size; 3) smaller models also demonstrate reasoning capability through Chain-of-Thought; 4) a multi-task dataset that includes synergy tasks can be used to verify the generalizability of LLMs on natural language understanding and, at the same time, to test the interconnectedness of these NLP tasks. We share our resources and code for reproducibility under a CC BY-NC 4.0 license.

2023

Pre-train, Prompt, and Recommendation: A Comprehensive Survey of Language Modeling Paradigm Adaptations in Recommender Systems
Peng Liu | Lemei Zhang | Jon Atle Gulla
Transactions of the Association for Computational Linguistics, Volume 11

Pre-trained Language Models (PLMs) have achieved tremendous success in the field of Natural Language Processing (NLP) by learning universal representations on large corpora in a self-supervised manner. The pre-trained models and the learned representations can benefit a range of downstream NLP tasks. This training paradigm has recently been adapted to the recommendation domain and is considered a promising approach by both academia and industry. In this paper, we systematically investigate how to extract and transfer knowledge from pre-trained models learned by different PLM-related training paradigms to improve recommendation performance from various perspectives, such as generality, sparsity, efficiency and effectiveness. Specifically, we propose a comprehensive taxonomy to divide existing PLM-based recommender systems w.r.t. their training strategies and objectives. Then, we analyze and summarize the connection between PLM-based training paradigms and different input data types for recommender systems. Finally, we elaborate on open issues and future research directions in this vibrant field.

2022

Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation
Yujie Xing | Jinglun Cai | Nils Barlaug | Peng Liu | Jon Atle Gulla
Findings of the Association for Computational Linguistics: NAACL 2022

Open-domain conversational systems are assumed to generate equally good responses on multiple domains. Previous work achieved good performance on a single corpus, but training and evaluating on multiple corpora from different domains is less studied. This paper explores methods of generating relevant responses for each of multiple multi-domain corpora. We first examine interleaved learning, which intermingles multiple corpora, as the baseline. We then investigate two multi-domain learning methods, labeled learning and multi-task labeled learning, which encode each corpus through a unique corpus embedding. Furthermore, we propose Domain-specific Frequency (DF), a novel word-level importance weight that measures the relative importance of a word for a specific corpus compared to other corpora. Based on DF, we propose weighted learning, a method that integrates DF into the loss function. We also adopt DF as a new evaluation metric. Extensive experiments show that our methods gain significant improvements on both automatic and human evaluation. We share our code and data for reproducibility.

Building Sentiment Lexicons for Mainland Scandinavian Languages Using Machine Translation and Sentence Embeddings
Peng Liu | Cristina Marco | Jon Atle Gulla
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents a simple but effective method to build sentiment lexicons for the three Mainland Scandinavian languages: Danish, Norwegian and Swedish. This method benefits from the English Sentiwordnet and a thesaurus in one of the target languages. Sentiment information from the English resource is mapped to the target languages by using machine translation and similarity measures based on sentence embeddings. A number of experiments with Scandinavian languages are performed in order to determine the best working sentence embedding algorithm for this task. A careful extrinsic evaluation on several datasets yields state-of-the-art results using a simple rule-based sentiment analysis algorithm. The resources are made freely available under an MIT License.

2016

Political News Sentiment Analysis for Under-resourced Languages
Patrik F. Bakken | Terje A. Bratlie | Cristina Marco | Jon Atle Gulla
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper presents classification results for the analysis of sentiment in political news articles. The domain of political news is particularly challenging, as journalists are presumably objective, whilst at the same time opinions can be subtly expressed. To deal with this challenge, in this work we use a two-step classification model that first identifies subjective texts and then distinguishes positive from negative sentiment. More specifically, we propose a shallow machine learning approach where only minimal features are needed to train the classifier, including sentiment-bearing Co-Occurring Terms (COTs) and negation words. This approach yields close to state-of-the-art results. Contrary to results in other domains, the use of negations as features does not have a positive impact on the evaluation results. This method is particularly suited for languages that suffer from a lack of resources, such as sentiment lexicons or parsers, and for systems that need to function in real time.

2011

Enhancing the HL-SOT Approach to Sentiment Analysis via a Localized Feature Selection Framework
Wei Wei | Jon Atle Gulla
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Sentiment Learning on Product Reviews via Sentiment Ontology Tree
Wei Wei | Jon Atle Gulla
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

1996

A Sign Expansion Approach to Dynamic, Multi-Purpose Lexicons
Jon Atle Gulla | Sjur Nørstebø Moshagen
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics