Proceedings of the 17th Workshop on Building and Using Comparable Corpora (BUCC) @ LREC-COLING 2024

Pierre Zweigenbaum, Reinhard Rapp, Serge Sharoff (Editors)

Anthology ID: 2024.bucc-1
Month: May
Year: 2024
Address: Torino, Italia
Venue: BUCC
Events: Workshop on Building and Using Comparable Corpora (2024) | The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Publisher: ELRA and ICCL
URL: https://aclanthology.org/2024.bucc-1/
PDF: https://aclanthology.org/2024.bucc-1.pdf

On a Novel Application of Wasserstein-Procrustes for Unsupervised Cross-Lingual Alignment of Embeddings
Guillem Ramírez | Rumen Dangovski | Preslav Nakov | Marin Soljacic

Invited Talk: The Way Towards Massively Multilingual Language Models
François Yvon

Exploring the Potential of Large Language Models in Adaptive Machine Translation for Generic Text and Subtitles
Abdelhadi Soudi | Mohamed Hannani | Kristof Van Laerhoven | Eleftherios Avramidis

INCLURE: a Dataset and Toolkit for Inclusive French Translation
Paul Lerner | Cyril Grouin

Creating Clustered Comparable Corpora from Wikipedia with Different Fuzziness Levels and Language Representativity
Anna Laskina | Eric Gaussier | Gaelle Calvary

EuReCo: Not Building and Yet Using Federated Comparable Corpora for Cross-Linguistic Research
Marc Kupietz | Piotr Banski | Nils Diewald | Beata Trawinski | Andreas Witt

Building Annotated Parallel Corpora Using the ATIS Dataset: Two UD-style treebanks in English and Turkish
Neslihan Cesur | Aslı Kuzgun | Mehmet Kose | Olcay Taner Yıldız

Bootstrapping the Annotation of UD Learner Treebanks
Arianna Masciolini

SweDiagnostics: A Diagnostics Natural Language Inference Dataset for Swedish
Felix Morger

Multiple Discourse Relations in English TED Talks and Their Translation into Lithuanian, Portuguese and Turkish
Deniz Zeyrek | Giedrė Valūnaitė Oleškevičienė | Amalia Mendes

mini-CIEP+ : A Shareable Parallel Corpus of Prose
Annemarie Verkerk | Luigi Talamo