Diego Molla

Also published as: Diego Molla-Aliod, Diego Mollá, Diego Mollá Aliod, Diego Mollá-Aliod


2024

Exploring Instructive Prompts for Large Language Models in the Extraction of Evidence for Supporting Assigned Suicidal Risk Levels
Jiyu Chen | Vincent Nguyen | Xiang Dai | Diego Molla-Aliod | Cecile Paris | Sarvnaz Karimi
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

Monitoring and predicting the expression of suicidal risk in individuals’ social media posts is a central focus in clinical NLP. Yet, existing approaches frequently lack a crucial explainability component necessary for extracting evidence related to an individual’s mental health state. We describe the CSIRO Data61 team’s evidence extraction system submitted to the CLPsych 2024 shared task. The task aims to investigate the zero-shot capabilities of open-source LLMs in extracting evidence regarding an individual’s assigned suicide risk level from social media discourse. The results are assessed against ground truth evidence annotated by psychological experts, with an achieved recall-oriented BERTScore of 0.919. Our findings suggest that LLMs can feasibly extract information supporting the evaluation of suicidal risk in social media discourse. Opportunities for refinement exist, notably in crafting concise and effective instructions to guide the extraction process.

Using Large Language Models to Evaluate Biomedical Query-Focused Summarisation
Hashem Hijazi | Diego Molla | Vincent Nguyen | Sarvnaz Karimi
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing

Biomedical question-answering systems remain popular for biomedical experts interacting with the literature to answer their medical questions. However, these systems are difficult to evaluate in the absence of costly human experts. Therefore, automatic evaluation metrics are often used in this space. Traditional automatic metrics such as ROUGE or BLEU, which rely on token overlap, have shown a low correlation with human judgements. We present a study that uses large language models (LLMs) to automatically evaluate systems from an international challenge on biomedical semantic indexing and question answering, called BioASQ. We measure the agreement of LLM-produced scores against human judgements. We show that LLMs correlate similarly to lexical methods when using basic prompting techniques. However, by aggregating evaluators with LLMs or by fine-tuning, we find that our methods outperform the baselines by a large margin, achieving a Spearman correlation of 0.501 and 0.511, respectively.

2023

Exploring Causal Directions through Word Occurrences: Semi-supervised Bayesian Classification Framework
King Tao Jason Ng | Diego Molla
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association

Determining causal directions in sentences plays a critical role in understanding cause-and-effect relationships between entities. In this paper, we show empirically that word occurrences from several Internet domains resemble the characteristics of causal directions. Our research contributes to the knowledge of the underlying data generation process behind causal directions. We propose a two-phase method: (1) a Bayesian framework, which generates synthetic data from posteriors by incorporating word occurrences from the Internet domains, and (2) a pre-trained BERT model, which utilises the semantics of words based on the context to perform classification. The proposed method achieves an improvement in performance for the Cause-Effect relations of the SemEval-2010 dataset, when compared with random guessing.

Overview of the 2023 ALTA Shared Task: Discriminate between Human-Written and Machine-Generated Text
Diego Molla | Haolan Zhan | Xuanli He | Qiongkai Xu
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association

The ALTA shared tasks have been running annually since 2010. In 2023, the purpose of the task is to build automatic detection systems that can discriminate between human-written text and synthetic text generated by Large Language Models (LLMs). In this paper we present the task, the evaluation criteria, and the results of the systems participating in the shared task.

Synthetic Dialogue Dataset Generation using LLM Agents
Yelaman Abdullin | Diego Molla | Bahadorreza Ofoghi | John Yearwood | Qingyang Li
Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)

Linear programming (LP) problems are pervasive in real-life applications. However, despite their apparent simplicity, an untrained user may find it difficult to determine the linear model of their specific problem. We envisage the creation of a goal-oriented conversational agent that will engage in conversation with the user to elicit all information required so that a subsequent agent can generate the linear model. In this paper, we present an approach for the generation of sample dialogues that can be used to develop and train such a conversational agent. Using prompt engineering, we develop two agents that “talk” to each other, one acting as the conversational agent, and the other acting as the user. Using a set of text descriptions of linear problems from NL4Opt available to the user only, the agent and the user engage in conversation until the agent has retrieved all key information from the original problem description. We also propose an extrinsic evaluation of the dialogues by assessing how well the summaries generated by the dialogues match the original problem descriptions. We conduct human and automatic evaluations, including an evaluation approach that uses GPT-4 to mimic the human evaluation metrics. The evaluation results show an overall good quality of the dialogues, though research is still needed to improve the quality of the GPT-4 evaluation metrics. The resulting dialogues, including the human annotations of a subset, are available to the research community. The conversational agent used for the generation of the dialogues can be used as a baseline.

2022

Overview of the 2022 ALTA Shared task: PIBOSO sentence classification, 10 years later
Diego Mollá
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association

The ALTA shared tasks have been running annually since 2010. This year, the shared task is a re-visit of the 2012 ALTA shared task. The purpose of this task is to classify sentences of medical publications using the PIBOSO taxonomy. This is a multi-label classification task which can help medical researchers and practitioners conduct Evidence Based Medicine (EBM). In this paper we present the task, the evaluation criteria, and the results of the systems participating in the shared task.

The Construction and Evaluation of the LEAFTOP Dataset of Automatically Extracted Nouns in 1480 Languages
Gregory Baker | Diego Molla
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The LEAFTOP (language extracted automatically from thousands of passages) dataset consists of nouns that appear in multiple places in the four gospels of the New Testament. We use a naive approach, probabilistic inference, to identify likely translations in 1480 other languages. We evaluate this process and find that it provides lexiconaries with accuracy from 42% (Korafe) to 99% (Runyankole), averaging 72% correct across evaluated languages. The process translates up to 161 distinct lemmas from Koine Greek (average 159). We identify nouns which appear to be easy and hard to translate, language families where this technique works, and possible future improvements and extensions. The claims to novelty are: the use of a Koine Greek New Testament as the source language; using a fully-annotated manually-created grammatical parse of the source text; a custom scraper for texts in the target languages; a new metric for language similarity; a novel strategy for evaluation on low-resource languages.

Number Theory Meets Linguistics: Modelling Noun Pluralisation Across 1497 Languages Using 2-adic Metrics
Gregory Baker | Diego Molla
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

A simple machine learning model of pluralisation as a linear regression problem minimising a p-adic metric substantially outperforms even the most robust of Euclidean-space regressors on languages in the Indo-European, Austronesian, Trans-New Guinea, Sino-Tibetan, Nilo-Saharan, Oto-Manguean and Atlantic-Congo language families. There is insufficient evidence to support modelling distinct noun declensions as a p-adic neighbourhood even in Indo-European languages.

2021

Overview of the 2021 ALTA Shared Task: Automatic Grading of Evidence, 10 years later
Diego Mollá
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association

The 2021 ALTA shared task is the 12th instance of a series of shared tasks organised by ALTA since 2010. Motivated by the advances in machine learning in the last 10 years, this year’s task is a re-visit of the 2011 ALTA shared task. Set within the framework of Evidence Based Medicine (EBM), the goal is to predict the quality of the clinical evidence present in a set of documents. This year’s participant results did not improve over those of participants from 2011.

Demonstrating the Reliability of Self-Annotated Emotion Data
Anton Malko | Cecile Paris | Andreas Duenser | Maria Kangas | Diego Molla | Ross Sparks | Stephen Wan
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

Vent is a specialised iOS/Android social media platform with the stated goal of encouraging people to post about their feelings and explicitly label them. In this paper, we study a snapshot of more than 100 million messages obtained from the developers of Vent, together with the labels assigned by the authors of the messages. We establish the quality of the self-annotated data by conducting a qualitative analysis, a vocabulary-based analysis, and by training and testing an emotion classifier. We conclude that the self-annotated labels of our corpus are indeed indicative of the emotional contents expressed in the text and thus can support more detailed analyses of emotion expression on social media, such as emotion trajectories and factors influencing them.

2020

Overview of the 2020 ALTA Shared Task: Assess Human Behaviour
Diego Mollá
Proceedings of the 18th Annual Workshop of the Australasian Language Technology Association

The 2020 ALTA shared task is the 11th instance of a series of shared tasks organised by ALTA since 2010. The task is to classify texts posted in social media according to human judgements expressed in them. The data used for this task is a subset of SemEval 2018 AIT DISC, which has been annotated by domain experts for this task. In this paper we introduce the task, describe the data and present the results of participating systems.

2019

Overview of the 2019 ALTA Shared Task: Sarcasm Target Identification
Diego Molla | Aditya Joshi
Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association

We present an overview of the 2019 ALTA shared task. This is the 10th of the series of shared tasks organised by ALTA since 2010. The task was to detect the target of sarcastic comments posted on social media. We introduce the task, describe the data and present the results of baselines and participants. This year’s shared task was particularly challenging and no participating systems improved the results of our baseline.

2018

Overview of the 2018 ALTA Shared Task: Classifying Patent Applications
Diego Mollá | Dilesha Seneviratne
Proceedings of the Australasian Language Technology Association Workshop 2018

We present an overview of the 2018 ALTA shared task. This is the 9th of the series of shared tasks organised by ALTA since 2010. The task was to classify Australian patent applications following the sections defined by the International Patent Classification (IPC), using data made available by IP Australia. We introduce the task, describe the data and present the results of the participating teams. Some of the participating teams outperformed the state of the art.

Macquarie University at BioASQ 6b: Deep learning and deep reinforcement learning for query-based summarisation
Diego Mollá
Proceedings of the 6th BioASQ Workshop A challenge on large-scale biomedical semantic indexing and question answering

This paper describes Macquarie University’s contribution to the BioASQ Challenge (BioASQ 6b, Phase B). We focused on the extraction of the ideal answers, and the task was approached as an instance of query-based multi-document summarisation. In particular, this paper focuses on the experiments related to the deep learning and reinforcement learning approaches used in the submitted runs. The best run used a deep learning model under a regression-based framework. The deep learning architecture used features derived from the output of LSTM chains on word embeddings, plus features based on similarity with the query, and sentence position. The reinforcement learning approach was a proof-of-concept prototype that trained a global policy using REINFORCE. The global policy was implemented as a neural network that used tf.idf features encoding the candidate sentence, question, and context.

Supervised Machine Learning for Extractive Query Based Summarisation of Biomedical Data
Mandeep Kaur | Diego Mollá
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

The automation of text summarisation of biomedical publications is a pressing need due to the plethora of information available online. This paper explores the impact of several supervised machine learning approaches for extracting multi-document summaries for given queries. In particular, we compare classification and regression approaches for query-based extractive summarisation using data provided by the BioASQ Challenge. We tackled the problem of annotating sentences for training classification systems and show that a simple annotation approach outperforms regression-based summarisation.

2017

On Extending Neural Networks with Loss Ensembles for Text Classification
Hamideh Hajiabadi | Diego Molla-Aliod | Reza Monsefi
Proceedings of the Australasian Language Technology Association Workshop 2017

Towards the Use of Deep Reinforcement Learning with Global Policy for Query-based Extractive Summarisation
Diego Mollá-Aliod
Proceedings of the Australasian Language Technology Association Workshop 2017

Overview of the 2017 ALTA Shared Task: Correcting OCR Errors
Diego Mollá-Aliod | Steve Cassidy
Proceedings of the Australasian Language Technology Association Workshop 2017

Macquarie University at BioASQ 5b – Query-based Summarisation Techniques for Selecting the Ideal Answers
Diego Mollá
BioNLP 2017

Macquarie University’s contribution to the BioASQ challenge (Task 5b Phase B) focused on the use of query-based extractive summarisation techniques for the generation of the ideal answers. Four runs were submitted, with approaches ranging from a trivial system that selected the first n snippets, to the use of deep learning approaches under a regression framework. Our experiments and the ROUGE results of the five test batches of BioASQ indicate surprisingly good results for the trivial approach. Overall, most of our runs on the first three test batches achieved the best ROUGE-SU4 results in the challenge.

2016

Overview of the 2016 ALTA Shared Task: Cross-KB Coreference
Andrew Chisholm | Ben Hachey | Diego Mollá
Proceedings of the Australasian Language Technology Association Workshop 2016

Semi-supervised Clustering of Medical Text
Pracheta Sahoo | Asif Ekbal | Sriparna Saha | Diego Mollá | Kaushik Nandan
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)

Semi-supervised clustering is an attractive alternative to traditional (unsupervised) clustering in targeted applications. By using the information of a small annotated dataset, semi-supervised clustering can produce clusters that are customized to the application domain. In this paper, we present a semi-supervised clustering technique based on a multi-objective evolutionary algorithm (NSGA-II-clus). We apply this technique to the task of clustering medical publications for Evidence Based Medicine (EBM) and observe an improvement of the results against unsupervised and other semi-supervised clustering techniques.

2015

Query-Based Single Document Summarization Using an Ensemble Noisy Auto-Encoder
Mahmood Yousefi Azar | Kairit Sirts | Diego Mollá Aliod | Len Hamey
Proceedings of the Australasian Language Technology Association Workshop 2015

Similarity Metrics for Clustering PubMed Abstracts for Evidence Based Medicine
Hamed Hassanzadeh | Diego Mollá | Tudor Groza | Anthony Nguyen | Jane Hunter
Proceedings of the Australasian Language Technology Association Workshop 2015

Overview of the 2015 ALTA Shared Task: Identifying French Cognates in English Text
Laurianne Sitbon | Diego Molla | Haoxing Wang
Proceedings of the Australasian Language Technology Association Workshop 2015

2014

Impact of Citing Papers for Summarisation of Clinical Documents
Diego Mollá | Christopher Jones | Abeed Sarker
Proceedings of the Australasian Language Technology Association Workshop 2014

Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations in Tweets
Diego Molla | Sarvnaz Karimi
Proceedings of the Australasian Language Technology Association Workshop 2014

2013

Learning from OzCLO, the Australian Computational and Linguistics Olympiad
Dominique Estival | John Henderson | Mary Laughren | Diego Mollá | Cathy Bow | Rachel Nordlinger | Verna Rieschild | Andrea C. Schalley | Alexander W. Stanley | Colette Mrowa-Hopkins
Proceedings of the Fourth Workshop on Teaching NLP and CL

Automatic Prediction of Evidence-based Recommendations via Sentence-level Polarity Classification
Abeed Sarker | Diego Mollá-Aliod | Cécile Paris
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Multi-Objective Optimization for Clustering of Medical Publications
Asif Ekbal | Sriparna Saha | Diego Mollá | K Ravikumar
Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013)

Overview of the 2013 ALTA Shared Task
Diego Molla
Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013)

2012

Towards Two-step Multi-document Summarisation for Evidence Based Medicine: A Quantitative Analysis
Abeed Sarker | Diego Mollá-Aliod | Cécile Paris
Proceedings of the Australasian Language Technology Association Workshop 2012

Overview of the ALTA 2012 Shared Task
Iman Amini | David Martinez | Diego Molla
Proceedings of the Australasian Language Technology Association Workshop 2012

Experiments with Clustering-based Features for Sentence Classification in Medical Publications: Macquarie Test’s participation in the ALTA 2012 shared task.
Diego Mollá
Proceedings of the Australasian Language Technology Association Workshop 2012

Proceedings of the First International Workshop on Optimization Techniques for Human Language Technology
Pushpak Bhattacharyya | Asif Ekbal | Sriparna Saha | Mark Johnson | Diego Molla-Aliod | Mark Dras
Proceedings of the First International Workshop on Optimization Techniques for Human Language Technology

2011

Proceedings of the Australasian Language Technology Association Workshop 2011
Diego Molla | David Martinez
Proceedings of the Australasian Language Technology Association Workshop 2011

Automatic Grading of Evidence: the 2011 ALTA Shared Task
Diego Molla | Abeed Sarker
Proceedings of the Australasian Language Technology Association Workshop 2011

Development of a Corpus for Evidence Based Medicine Summarisation
Diego Molla | Maria Elena Santiago-Martinez
Proceedings of the Australasian Language Technology Association Workshop 2011

Outcome Polarity Identification of Medical Papers
Abeed Sarker | Diego Molla | Cécile Paris
Proceedings of the Australasian Language Technology Association Workshop 2011

2010

A Corpus for Evidence Based Medicine Summarisation
Diego Molla
Proceedings of the Australasian Language Technology Association Workshop 2010

2008

Indexing on Semantic Roles for Question Answering
Luiz Augusto Pizzato | Diego Mollá
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

2007

Question Answering in Restricted Domains: An Overview
Diego Mollá | José Luis Vicedo
Computational Linguistics, Volume 33, Number 1, March 2007

Named Entity Recognition in Question Answering of Speech Data
Diego Mollá | Menno van Zaanen | Steve Cassidy
Proceedings of the Australasian Language Technology Workshop 2007

Question Prediction Language Model
Luiz Augusto Pizzato | Diego Mollá
Proceedings of the Australasian Language Technology Workshop 2007

2006

Named Entity Recognition for Question Answering
Diego Mollá | Menno van Zaanen | Daniel Smith
Proceedings of the Australasian Language Technology Workshop 2006

Pseudo Relevance Feedback Using Named Entities for Question Answering
Luiz Augusto Pizzato | Diego Mollá | Cécile Paris
Proceedings of the Australasian Language Technology Workshop 2006

Learning of Graph-based Question Answering Rules
Diego Mollá
Proceedings of TextGraphs: the First Workshop on Graph Based Methods for Natural Language Processing

2005

Learning of Graph Rules for Question Answering
Diego Molla | Menno van Zaanen
Proceedings of the Australasian Language Technology Workshop 2005

Extracting Exact Answers using a Meta Question Answering System
Luiz Augusto Pizzato | Diego Molla
Proceedings of the Australasian Language Technology Workshop 2005

2004

Answerfinder: Question Answering by Combining Lexical, Syntactic and Semantic Information
Diego Molla | Mary Gardiner
Proceedings of the Australasian Language Technology Workshop 2004

2003

Towards semantic-based overlap measures for question-answering
Diego Mollá
Proceedings of the Australasian Language Technology Workshop 2003

Exploiting Paraphrases in a Question Answering System
Fabio Rinaldi | James Dowdall | Kaarel Kaljurand | Michael Hess | Diego Mollá
Proceedings of the Second International Workshop on Paraphrasing

Intrinsic versus Extrinsic Evaluations of Parsing Systems
Diego Mollá | Ben Hutchinson
Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable?

2002

Evangelising Language Technology: A Practically-Focussed Undergraduate Program
Robert Dale | Diego Mollá Aliod | Rolf Schwitter
Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics

2000

Answer Extraction Towards better Evaluations of NLP Systems
Rolf Schwitter | Diego Molla | Rachel Fournier | Michael Hess
ANLP-NAACL 2000 Workshop: Reading Comprehension Tests as Evaluation for Computer-Based Language Understanding Systems