Barbara Heinisch

2025

Terminologists as Stewards of Meaning in the Age of LLMs: A Digital Humanism Perspective
Barbara Heinisch
Proceedings of the 2nd LUHME Workshop

Digital Humanism calls for a reconfiguration of the development of digital technologies that embeds interdisciplinary collaboration, ethical reflexivity and critical scrutiny into both the design and evaluation of these systems. From a Digital Humanism perspective, terminologists play a vital role in safeguarding language understanding in specialized domains where clarity and consistency are critical (in both monolingual and multilingual contexts). This conceptual paper, therefore, examines the role of terminologists (and terminology) in the era of LLMs, with a focus on their function as stewards of meaning in specialized communication. The study draws on the principles of Digital Humanism to critically assess how terminologists can counteract various ethically and epistemologically problematic features characterizing current LLM development and deployment. In this regard, terminologists can ensure terminological precision and help preserve linguistic diversity and knowledge excluded from LLMs. They may also support inclusive, transparent and accountable digital infrastructures. By documenting system- and variety-specific terms, they counteract the homogenizing tendencies of LLMs and challenge epistemic monopolies. Their expertise bridges disciplines and reinforces that language is not neutral, but culturally and institutionally embedded. As educators and stewards of meaning, terminologists empower users to critically engage with LLM outputs, ensuring that language technologies remain ethically grounded and responsive to human contexts and values.

2022

The Influence of Intrinsic and Extrinsic Motivation on the Creation of Language Resources in a Citizen Linguistics Project about Lexicography
Barbara Heinisch
Proceedings of the 2nd Workshop on Novel Incentives in Data Collection from People: models, implementations, challenges and results within LREC 2022

In the field of citizen linguistics, various initiatives are aimed at the creation of language resources by members of the public. To recruit and retain these participants, different incentives informed by different motivations, both extrinsic and intrinsic, play a role at different project stages. Illustrated by a project in the field of lexicography which draws on the extrinsic and/or intrinsic motivation of participants, the complexity of providing the ‘right’ incentives is addressed. This complexity surfaces not only when considering cultural differences and the heterogeneity of the motivations participants might have but also as these motivations change over time. Here, identifying target groups may help to guide recruitment, retention and dissemination activities. In addition, continuous adaptations may be required during the course of the project to strike a balance between necessary and feasible incentives.

2021

Transforming Term Extraction: Transformer-Based Approaches to Multilingual Term Extraction Across Domains
Christian Lang | Lennart Wachowiak | Barbara Heinisch | Dagmar Gromann
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Developing Language Resources with Citizen Linguistics in Austria – A Case Study
Barbara Heinisch
Proceedings of the LREC 2020 Workshop on "Citizen Linguistics in Language Resource Development"

Language resources are a major ingredient for the advancement of language technologies. Citizen linguistics can help to create and annotate language resources, not only for the improvement of language technologies, such as machine translation, but also for the advancement of linguistic research. The (language) resources covered in this article are a corpus related to the Question of the Month project strand, which was initially aimed at co-creation in citizen linguistics, and a partially annotated database of pictures of written text in different languages found in the public sphere. The number of participants in these project strands differed significantly. Especially those activities that were related to data collection (and analysis) had a significantly higher number of contributions per participant. This especially held true for the activities with (prize) incentives. Nevertheless, the activities of the Question of the Month could reach a higher number of participants, even after the co-creation approach was no longer followed. In addition, the Question of the Month brought research gaps and new knowledge to light and challenged existing paradigms and practices. These are especially important for the advancement of scholarly research. Citizen linguistics can help gather and analyze linguistic data, including language resources, in a short period of time. Thus, it may help increase the access to and availability of language resources.

CogALex-VI Shared Task: Transrelation - A Robust Multilingual Language Model for Multilingual Relation Identification
Lennart Wachowiak | Christian Lang | Barbara Heinisch | Dagmar Gromann
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon

We describe our submission to the CogALex-VI shared task on the identification of multilingual paradigmatic relations building on XLM-RoBERTa (XLM-R), a robustly optimized and multilingual BERT model. In spite of several experiments with data augmentation, data addition and ensemble methods with a Siamese Triple Net, Transrelation, the XLM-R model with a linear classifier adapted to this specific task, performed best in testing and achieved the best results in the final evaluation of the shared task, even for a previously unseen language.

The Austrian Language Resource Portal for the Use and Provision of Language Resources in a Language Variety by Public Administration – a Showcase for Collaboration between Public Administration and a University
Barbara Heinisch | Vesna Lušicky
Proceedings of the 1st Workshop on Language Technologies for Government and Public Administration (LT4Gov)

The Austrian Language Resource Portal (Sprachressourcenportal Österreichs) is Austria’s central platform for language resources in the area of public administration. It focuses on language resources in the Austrian variety of the German language. As a product of the cooperation between a public administration body and a university, the Portal contains various language resources (terminological resources in the public administration domain, a language guide, named entities based on open public data, translation memories, etc.). German is a pluricentric language that varies considerably in the domain of public administration due to different public administration systems. Therefore, the Austrian Language Resource Portal stresses the importance of language resources specific to a language variety, thus paving the way for the re-use of variety-specific language data for human language technology, such as machine translation training, for the Austrian standard variety.