2025
Where and How as Key Factors for Knowledge-Enhanced Constrained Commonsense Generation
Ivan Martinez-Murillo
|
Paloma Moreda Pozo
|
Elena Lloret
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
This paper addresses a key limitation in Natural Language Generation (NLG) systems: their struggle with commonsense reasoning, which is essential for generating contextually appropriate and plausible text. The study proposes an approach to enhance the commonsense reasoning abilities of NLG systems by integrating external knowledge framed in a constrained commonsense generation task. The paper investigates strategies for extracting and injecting external knowledge into pre-trained models, specifically BART and T5, in both base and large configurations. Experimental results show that incorporating external knowledge extracted with a simple strategy leads to significant improvements in performance, with the models achieving 88% accuracy in generating plausible and correct sentences. When refined methods for knowledge extraction are applied, the accuracy further increases to 92%. These findings underscore the crucial role of high-quality external knowledge in enhancing the commonsense reasoning capabilities of NLG systems, suggesting that such integration is vital for advancing their performance in real-world applications.
Towards Intention-aligned Reviews Summarization: Enhancing LLM Outputs with Pragmatic Cues
Maria Miro Maestre
|
Robiert Sepulveda-Torres
|
Ernesto Luis Estevanell-Valladares
|
Armando Suarez Cueto
|
Elena Lloret
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Recent advancements in Natural Language Processing (NLP) have allowed systems to address complex tasks involving cultural knowledge, multi-step reasoning, and inference. While significant progress has been made in text summarization guided by specific instructions or stylistic cues, the integration of pragmatic aspects like communicative intentions remains underexplored, particularly in non-English languages. This study emphasizes communicative intentions as central to summary generation, classifying Spanish product reviews by intent and using prompt engineering to produce intention-aligned summaries. Results indicate challenges for large language models (LLMs) in processing extensive document clusters, with summarization accuracy heavily dependent on prior model exposure to similar intentions. Common intentions such as complimenting and criticizing are reliably handled, whereas less frequent ones like promising or questioning pose greater difficulties. These findings suggest that integrating communicative intentions into summarization tasks can significantly enhance summary relevance and clarity, thereby improving user experience in product review analysis.
Detecting Deception in Disinformation across Languages: The Role of Linguistic Markers
Alba Perez-Montero
|
Silvia Gargova
|
Elena Lloret
|
Paloma Moreda Pozo
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
The unstoppable proliferation of news driven by the rise of digital media has intensified the challenge of news verification. Natural Language Processing (NLP) offers solutions, primarily through content and context analysis. Recognizing the vital role of linguistic analysis, this paper presents a multilingual study of linguistic markers for automated deceptive fake news detection across English, Spanish, and Bulgarian. We compiled datasets in these languages to extract and analyze both general and specific linguistic markers. We then performed feature selection using the SelectKBest algorithm, applying it to various classification models with different combinations of general and specific linguistic markers. The results show that Logistic Regression and Support Vector Machine classification models achieved F1-scores above 0.8 for English and Spanish. For Bulgarian, Random Forest yielded the best results with an F1-score of 0.73. While these markers demonstrate potential for transferability to other languages, results may vary due to inherent linguistic characteristics. This necessitates further experimentation, especially in low-resource languages like Bulgarian. These findings highlight the significant potential of our dataset and linguistic markers for multilingual deceptive news detection.
GPLSICORTEX at SemEval-2025 Task 10: Leveraging Intentions for Generating Narrative Extractions
Ivan Martinez-Murillo
|
María Miró Maestre
|
Aitana Martínez
|
Snorre Ralund
|
Elena Lloret
|
Paloma Moreda Pozo
|
Armando Suárez Cueto
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
This paper describes our approach to address the SemEval-2025 Task 10 subtask 3, which is focused on narrative extraction given news articles with a dominant narrative. We design an external knowledge injection approach to fine-tune a Flan-T5 model so the generated narrative explanations are in line with the dominant narrative determined in each text. We also incorporate pragmatic information in the form of communicative intentions, using them as external knowledge to assist the model. This ensures that the generated texts align more closely with the intended explanations and effectively convey the expected meaning. The results show that our approach ranks 3rd in the task leaderboard (0.7428 in Macro-F1) with concise and effective news explanations. The analyses highlight the importance of adding pragmatic information when training systems to generate adequate narrative extractions.
2023
Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation
Anabela Barreiro
|
Max Silberztein
|
Elena Lloret
|
Marcin Paprzycki
Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation
Towards an Efficient Approach for Controllable Text Generation
Iván Martínez-Murillo
|
Paloma Moreda
|
Elena Lloret
Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation
A Review of Research-Based Automatic Text Simplification Tools
Isabel Espinosa-Zaragoza
|
José Abreu-Salas
|
Elena Lloret
|
Paloma Moreda
|
Manuel Palomar
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
In the age of knowledge, the democratisation of information facilitated through the Internet may not be as pervasive if written language poses challenges to particular sectors of the population. The objective of this paper is to present an overview of research-based automatic text simplification tools. Consequently, we describe aspects such as the language, language phenomena, language levels simplified, approaches, specific target populations these tools are created for (e.g. individuals with cognitive impairment, attention deficit, elderly people, children, language learners), and accessibility and availability considerations. The review of existing studies covering automatic text simplification tools was conducted by searching two databases: Web of Science and Scopus. The eligibility criteria involve text simplification tools with a scientific background in order to ascertain how they operate. This methodology yielded 27 text simplification tools that are further analysed. Some of the main conclusions reached with this review are the lack of resources accessible to the public, the need for customisation to foster the individual’s independence by allowing the user to select what s/he finds challenging to understand while not limiting the user’s capabilities, and the need for more simplification tools in languages other than English, to mention a few.
2022
Multi3Generation: Multitask, Multilingual, Multimodal Language Generation
Anabela Barreiro
|
José GC de Souza
|
Albert Gatt
|
Mehul Bhatt
|
Elena Lloret
|
Aykut Erdem
|
Dimitra Gkatzia
|
Helena Moniz
|
Irene Russo
|
Fabio Kepler
|
Iacer Calixto
|
Marcin Paprzycki
|
François Portet
|
Isabelle Augenstein
|
Mirela Alhasani
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation. This “meta-paper” will serve as a reference for citations of the Action in future publications. It presents the objectives, challenges and the links to the achieved outcomes.
2019
Towards Adaptive Text Summarization: How Does Compression Rate Affect Summary Readability of L2 Texts?
Tatiana Vodolazova
|
Elena Lloret
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
This paper addresses the problem of readability of automatically generated summaries in the context of second language learning. For this, we experimented with a new corpus of level-annotated simplified English texts. The texts were summarized using a total of 7 extractive and abstractive summarization systems with compression rates of 20%, 40%, 60% and 80%. We analyzed the generated summaries in terms of lexical, syntactic and length-based features of readability, and concluded that summary complexity depends on the compression rate, summarization technique and the nature of the summarized corpus. Our experiments demonstrate the importance of choosing appropriate summarization techniques that align with users’ needs and language proficiency.
The Impact of Rule-Based Text Generation on the Quality of Abstractive Summaries
Tatiana Vodolazova
|
Elena Lloret
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
In this paper we describe how an abstractive text summarization method improved the informativeness of automatic summaries by integrating syntactic text simplification, subject-verb-object concept frequency scoring and a set of rules that transform text into its semantic representation. We analyzed the impact of each component of our approach on the quality of generated summaries and tested it on the DUC 2002 dataset. Our experiments showed that our approach outperformed other state-of-the-art abstractive methods while maintaining acceptable linguistic quality and redundancy rate.
2017
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
George Giannakopoulos
|
Elena Lloret
|
John M. Conroy
|
Josef Steinberger
|
Marina Litvak
|
Peter Rankel
|
Benoit Favre
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
MultiLing 2017 Overview
George Giannakopoulos
|
John Conroy
|
Jeff Kubina
|
Peter A. Rankel
|
Elena Lloret
|
Josef Steinberger
|
Marina Litvak
|
Benoit Favre
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
In this brief report we present an overview of the MultiLing 2017 effort and workshop, as implemented within EACL 2017. MultiLing is a community-driven initiative that pushes the state-of-the-art in Automatic Summarization by providing data sets and fostering further research and development of summarization systems. This year the scope of the workshop was widened, bringing together researchers that work on summarization across sources, languages and genres. We summarize the main tasks planned and implemented this year, the contributions received, and we also provide insights on next steps.
Ultra-Concise Multi-genre Summarisation of Web2.0: towards Intelligent Content Generation
Elena Lloret
|
Ester Boldrini
|
Patricio Martínez-Barco
|
Manuel Palomar
Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
Electronic word of mouth has become the most powerful communication channel thanks to the wide usage of social media. Our research proposes an approach towards the production of automatic ultra-concise summaries from multiple Web 2.0 sources. We exploit user-generated content from reviews and microblogs in different domains, and compile and analyse four types of ultra-concise summaries: a) positive information; b) negative information; c) both; or d) objective information. The appropriateness and usefulness of our model are demonstrated by its successful results and great potential in real-life applications, which represents a relevant advancement over state-of-the-art approaches.
Improving the Naturalness and Expressivity of Language Generation for Spanish
Cristina Barros
|
Dimitra Gkatzia
|
Elena Lloret
Proceedings of the 10th International Conference on Natural Language Generation
We present a flexible Natural Language Generation approach for Spanish, focused on the surface realisation stage, which integrates an inflection module in order to improve the naturalness and expressivity of the generated language. This inflection module inflects the verbs using an ensemble of trainable algorithms, whereas the other types of words (e.g. nouns and determiners) are inflected using hand-crafted rules. We show that our approach achieves 2% higher accuracy than two state-of-the-art inflection generation approaches. Furthermore, our proposed approach also predicts an extra feature: the inflection of the imperative mood, which was not taken into account by previous work. We also present a user evaluation, where we demonstrate that the proposed method significantly improves the perceived naturalness of the generated language.
Inflection Generation for Spanish Verbs using Supervised Learning
Cristina Barros
|
Dimitra Gkatzia
|
Elena Lloret
Proceedings of the First Workshop on Subword and Character Level Models in NLP
We present a novel supervised approach to inflection generation for verbs in Spanish. Our system takes as input the verb’s lemma form and the desired features such as person, number and tense, and is able to predict the appropriate grammatical conjugation. Even though our approach learns from fewer examples compared to previous work, it is able to deal with all the Spanish moods (indicative, subjunctive and imperative), in contrast to previous work, which focuses only on the indicative and subjunctive moods. We show that in an intrinsic evaluation, our system achieves 99% accuracy, outperforming (although not significantly) two competitive state-of-the-art systems. The successful results obtained clearly indicate that our approach could be integrated into wider approaches related to text generation in Spanish.
2016
Generating sets of related sentences from input seed features
Cristina Barros
|
Elena Lloret
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)
Content Selection through Paraphrase Detection: Capturing different Semantic Realisations of the Same Idea
Elena Lloret
|
Claire Gardent
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)
Analysing the Integration of Semantic Web Features for Document Planning across Genres
Marta Vicente
|
Elena Lloret
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)
2015
The University of Alicante at MultiLing 2015: approach, results and further insights
Marta Vicente
|
Óscar Alcón
|
Elena Lloret
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Input Seed Features for Guiding the Generation Process: A Statistical Approach for Spanish
Cristina Barros
|
Elena Lloret
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)
2011
Multi-Document Summarization by Capturing the Information Users are Interested in
Elena Lloret
|
Laura Plaza
|
Ahmet Aker
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011
Finding the Best Approach for Multi-lingual Text Summarisation: A Comparative Analysis
Elena Lloret
|
Manuel Palomar
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011
Towards a Unified Approach for Opinion Question Answering and Summarization
Elena Lloret
|
Alexandra Balahur
|
Manuel Palomar
|
Andrés Montoyo
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)
2010
Quantifying the Limits and Success of Extractive Summarization Systems Across Domains
Hakan Ceylan
|
Rada Mihalcea
|
Umut Özertem
|
Elena Lloret
|
Manuel Palomar
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Experiments on Summary-based Opinion Classification
Elena Lloret
|
Horacio Saggion
|
Manuel Palomar
Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text
2009
Towards Building a Competitive Opinion Summarization System: Challenges and Keys
Elena Lloret
|
Alexandra Balahur
|
Manuel Palomar
|
Andrés Montoyo
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium
Summarizing Threads in Blogs Using Opinion Polarity
Alexandra Balahur
|
Elena Lloret
|
Ester Boldrini
|
Andrés Montoyo
|
Manuel Palomar
|
Patricio Martínez-Barco
Proceedings of the Workshop on Events in Emerging Text Types