John Vidler


2025

pdf bib
FreeTxt: Analyse and Visualise Multilingual Qualitative Survey Data for Cultural Heritage Sites
Nouran Khallaf | Ignatius Ezeani | Dawn Knight | Paul Rayson | Mo El-Haj | John Vidler | James Davies | Fernando Alva-Manchego
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

We introduce FreeTxt, a free and open-source web-based tool designed to support the analysis and visualisation of multilingual qualitative survey data, with a focus on low-resource languages. Developed in collaboration with stakeholders, FreeTxt integrates established techniques from corpus linguistics with modern natural language processing methods in an intuitive interface accessible to non-specialists. The tool currently supports bilingual processing and visualisation of English and Welsh responses, with ongoing extensions to other languages such as Vietnamese. Key functionalities include semantic tagging via PyMUSAS, multilingual sentiment analysis, keyword and collocation visualisation, and extractive summarisation. User evaluations with cultural heritage institutions demonstrate the system’s utility and potential for broader impact.

pdf bib
SENTimental - a Simple Multilingual Sentiment Annotation Tool
John Vidler | Paul Rayson | Dawn Knight
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Here we present SENTimental, a simple and fast web-based, mobile-friendly tool for capturing sentiment annotations from participants and citizen scientist volunteers to create training and testing data for low-resource languages. In contrast to existing tools, we focus on assigning broad values to segments of text over specific tags for tokens or spans to build datasets for training and testing LLMs. The SENTimental interface minimises barriers to entry with a goal of maximising the time a user spends in a flow state whereby they are able to quickly and accurately rate each text fragment without being distracted by the complexity of the interface. Designed from the outset to handle multilingual representations, SENTimental allows for parallel corpus data to be presented to the user and switched between instantly for immediate comparison. As such this allows for users in any loaded languages to contribute to the data gathered, building up comparable rankings in a simple structured dataset for later processing.

2024

pdf bib
Exploring the Suitability of Transformer Models to Analyse Mental Health Peer Support Forum Data for a Realist Evaluation
Matthew Coole | Paul Rayson | Zoe Glossop | Fiona Lobban | Paul Marshall | John Vidler
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

Mental health peer support forums have become widely used in recent years. The emerging mental health crisis and the COVID-19 pandemic have meant that finding a place online for support and advice when dealing with mental health issues is more critical than ever. The need to examine, understand and find ways to improve the support provided by mental health forums is vital in the current climate. As part of this, we present our initial explorations in using modern transformer models to detect four key concepts (connectedness, lived experience, empathy and gratitude), which we believe are essential to understanding how people use mental health forums and will serve as a basis for testing more expansive realise theories about mental health forums in the future. As part of this work, we also replicate previously published results on empathy utilising an existing annotated dataset and test the other concepts on our manually annotated mental health forum posts dataset. These results serve as a basis for future research examining peer support forums.