Aitor Álvarez

Also published as: Aitor Arronte Álvarez


2024

Real-Time Speech-Driven Avatar Animation by Predicting Facial Landmarks and Deformation Blendshapes
Juan Camilo Vasquez-Correa | Santiago Moreno-Acevedo | Ander Gonzalez-Docasal | Aritz Lasarguren | Jone Lòpez | Egoitz Rodriguez | Aitor Álvarez
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)

Speech Emotion Recognition for Call Centers using Self-supervised Models: A Complete Pipeline for Industrial Applications
Juan M. Martín-Doñas | Asier López Zorrilla | Mikel deVelasco | Juan Camilo Vasquez-Correa | Aitor Álvarez | Maria Inés Torres | Paz Delgado | Ane Lazpiur | Blanca Romero | Irati Alkorta
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)

2016

Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles
Aitor Álvarez | Marina Balenciaga | Arantza del Pozo | Haritz Arzelus | Anna Matamala | Carlos-D. Martínez-Hinarejos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes the evaluation methodology followed to measure the impact of using a machine learning algorithm to automatically segment intralingual subtitles. The segmentation quality, productivity and self-reported post-editing effort achieved with this approach are shown to improve on those obtained with the character-counting technique currently most widely used for automatic subtitle segmentation. The corpus used to train and test the proposed automated segmentation method is also described and shared with the community, in order to foster further research in this area.
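
For context, a minimal sketch of the character-counting baseline mentioned in the abstract, as it is commonly implemented in subtitling tools; the 37-character line limit, the function name and the example text are illustrative assumptions, not taken from the paper.

# Hypothetical character-count segmenter: the baseline the paper compares against.
# It only counts characters and ignores syntax, so it may split phrases awkwardly.
MAX_CHARS_PER_LINE = 37  # illustrative limit, not from the paper

def segment_by_characters(text):
    """Greedily break a transcript into subtitle lines, cutting at the last
    space that keeps each line within the character limit."""
    words, lines, current = text.split(), [], ""
    for word in words:
        candidate = (current + " " + word).strip()
        if len(candidate) <= MAX_CHARS_PER_LINE or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines

print(segment_by_characters(
    "this baseline only counts characters and may split "
    "syntactic units such as noun phrases across lines"))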

2015

The reception of intralingual and interlingual automatic subtitling: An exploratory study within the HBB4ALL project
Anna Matamala | Andreu Oliver | Aitor Álvarez | Andoni Azpeitia
Proceedings of Translating and the Computer 37

Enriching Digitized Medieval Manuscripts: Linking Image, Text and Lexical Knowledge
Aitor Arronte Álvarez
Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

2014

SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
Arantza del Pozo | Carlo Aliprandi | Aitor Álvarez | Carlos Mendes | Joao P. Neto | Sérgio Paulo | Nicola Piccinini | Matteo Raffaelli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the data collection, annotation and sharing activities carried out within the FP7 EU-funded SAVAS project. The project aims to collect, share and reuse audiovisual language resources from broadcasters and subtitling companies in order to develop large-vocabulary continuous speech recognisers for specific domains and new languages, thereby addressing the automated subtitling needs of the media industry.

Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
Pablo Ruiz | Aitor Álvarez | Haritz Arzelus
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Long audio alignment systems for Spanish and English are presented within an automatic subtitling application. Language-specific phone decoders automatically recognize audio contents at the phoneme level, while language-dependent grapheme-to-phoneme modules transcribe the script for the audio. A dynamic programming algorithm (Hirschberg’s algorithm) finds matches between the phonemes output by the phone decoder and the phonemes in the script’s transcription. Alignment accuracy is evaluated when scoring alignment operations with a baseline binary matrix and when scoring them with several continuous-score matrices based on phoneme similarity, as assessed by comparing multivalued phonological features. Accuracy results are reported at the phoneme, word and subtitle level. Alignment accuracy with the continuous scoring matrices based on phonological similarity was clearly higher than with the baseline binary matrix.
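
A minimal sketch of the scoring contrast described in the abstract (not the authors' implementation): a standard Needleman-Wunsch dynamic program aligns the decoder's phonemes against the script's phonemes with a pluggable scoring function, while Hirschberg's algorithm would compute the same alignment in linear space. The gap penalty, the small phonological feature table and the example phoneme sequences are illustrative assumptions.

GAP = -1.0  # hypothetical gap (insertion/deletion) penalty

def binary_score(a, b):
    """Baseline binary matrix: 1 for an exact phoneme match, -1 otherwise."""
    return 1.0 if a == b else -1.0

# Hypothetical feature table: each phoneme mapped to a few phonological
# features (place, manner, voicing) used to grade partial similarity.
FEATURES = {
    "p": ("bilabial", "plosive", "voiceless"),
    "b": ("bilabial", "plosive", "voiced"),
    "t": ("alveolar", "plosive", "voiceless"),
    "d": ("alveolar", "plosive", "voiced"),
    "s": ("alveolar", "fricative", "voiceless"),
}

def feature_score(a, b):
    """Continuous score in [-1, 1] from the fraction of shared features."""
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return binary_score(a, b)
    shared = sum(x == y for x, y in zip(fa, fb)) / len(fa)
    return 2.0 * shared - 1.0

def align(ref, hyp, score):
    """Needleman-Wunsch global alignment; returns (total score, aligned pairs)."""
    n, m = len(ref), len(hyp)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP
    for j in range(1, m + 1):
        dp[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1]),
                           dp[i - 1][j] + GAP,
                           dp[i][j - 1] + GAP)
    # Traceback to recover the aligned phoneme pairs ("-" marks a gap).
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1])); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + GAP:
            pairs.append((ref[i - 1], "-")); i -= 1
        else:
            pairs.append(("-", hyp[j - 1])); j -= 1
    return dp[n][m], list(reversed(pairs))

if __name__ == "__main__":
    script = ["p", "a", "t", "a"]   # phonemes from the script transcription
    decoded = ["b", "a", "d", "a"]  # phonemes from the phone decoder
    for name, fn in (("binary", binary_score), ("feature-based", feature_score)):
        s, al = align(script, decoded, fn)
        print(f"{name:13s} score={s:+.2f}  alignment={al}")

The feature-based matrix rewards near-matches such as p/b or t/d that the binary matrix penalizes as full mismatches, which is the intuition behind the accuracy gains reported in the paper.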