Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution
Zdeněk Žabokrtský | Maciej Ogrodniczuk
Findings of the Second Shared Task on Multilingual Coreference Resolution
Zdeněk Žabokrtský | Miloslav Konopik | Anna Nedoluzhko | Michal Novák | Maciej Ogrodniczuk | Martin Popel | Ondrej Prazak | Jakub Sido | Daniel Zeman
This paper summarizes the second edition of the shared task on multilingual coreference resolution, held with the CRAC 2023 workshop. Just like last year, participants were asked to create trainable systems that detect mentions and group them based on identity coreference; however, this year’s edition uses a slightly different primary evaluation score and is broader in terms of covered languages: version 1.1 of CorefUD, the multilingual collection of harmonized coreference resources, served as the source of training and evaluation data, with 17 datasets for 12 languages. Seven systems competed in this shared task.
Multilingual coreference resolution: Adapt and Generate
Natalia Skachkova | Tatiana Anikina | Anna Mokhova
The paper presents two multilingual coreference resolution systems submitted to the CRAC 2023 Shared Task. The DFKI-Adapt system achieves a 61.86 F1 score on the shared task test data, outperforming the official baseline by 4.9 F1 points. It combines several features and training settings, including character embeddings, adapter modules, joint pre-training, and loss-based re-training. We evaluate each of these settings on 12 datasets and compare the results. The other submission, DFKI-MPrompt, uses a novel approach that prompts for mention generation. Although this model scores lower than the baseline, the method shows a new way of approaching the coreference task and yields good results with just five epochs of training.
Neural End-to-End Coreference Resolution using Morphological Information
Tuğba Pamay Arslan | Kutay Acar | Gülşen Eryiğit
In morphologically rich languages, words are composed of morphemes that carry rich grammatical information, so such languages may require morpheme-level representations in addition to word-level ones. This study introduces a neural multilingual end-to-end coreference resolution system that incorporates morphological information into transformer-based word embeddings on top of the baseline model. The proposed model participated in the Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023). Explicitly including morphological information in coreference resolution improves performance, especially for morphologically rich languages (e.g., Catalan, Hungarian, and Turkish). The introduced model outperforms the baseline system by 2.57 percentage points on average, obtaining a 59.53% CoNLL F-score.
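One common way to combine word-level and morpheme-level information, as the abstract above describes, is to concatenate a token's word embedding with a pooled embedding of its morphemes. The sketch below illustrates this general idea only; the function names, the mean-pooling choice, and the toy dimensions are assumptions, not the authors' actual architecture.

```python
# Hypothetical sketch: build a token representation by concatenating a
# word embedding with the mean of its morpheme embeddings. Pure-Python
# toy code; a real system would use tensors from a pretrained encoder.

def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]

def token_representation(word_emb, morpheme_embs):
    """Concatenate the word embedding with pooled morpheme embeddings."""
    return word_emb + mean_pool(morpheme_embs)

# Toy example: a 3-dim word embedding and two 2-dim morpheme embeddings
# (e.g., a stem and a case suffix in a morphologically rich language).
word = [0.1, 0.2, 0.3]
morphemes = [[1.0, 0.0], [0.0, 1.0]]
rep = token_representation(word, morphemes)
# rep == [0.1, 0.2, 0.3, 0.5, 0.5]
```

The resulting vector keeps the word-level signal intact while appending a fixed-size morphological summary, so downstream layers can learn to weight the two sources.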
ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution
Milan Straka
We present CorPipe, the winning entry to the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Our system is an improved version of our earlier multilingual coreference pipeline, and it surpasses other participants by a large margin of 4.5 percentage points. CorPipe first performs mention detection, followed by coreference linking via an antecedent-maximization approach on the retrieved spans. Both tasks are trained jointly on all available corpora using a shared pretrained language model. Our main improvements comprise inputs larger than 512 subwords and a change to mention decoding that supports ensembling. The source code is available at https://github.com/ufal/crac2023-corpipe.
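Antecedent-maximization linking, as mentioned in the abstract above, can be sketched as follows: each mention selects its highest-scoring candidate among the earlier mentions plus a dummy "no antecedent" option, and the resulting links induce the coreference clusters. This is an illustration of the general technique with a toy scorer, not CorPipe's actual implementation.

```python
# Minimal sketch of antecedent-maximization coreference linking
# (generic technique only; scorer and data are hypothetical).

def link_antecedents(mentions, score):
    """Return the chosen antecedent index for each mention.

    `score(i, j)` rates mention j as antecedent of mention i; the dummy
    no-antecedent option has a fixed score of 0.0, so a mention starts a
    new entity (link None) when no candidate scores above it.
    """
    links = []
    for i in range(len(mentions)):
        best, best_score = None, 0.0          # dummy antecedent baseline
        for j in range(i):                    # only earlier mentions qualify
            s = score(i, j)
            if s > best_score:
                best, best_score = j, s
        links.append(best)
    return links

def clusters_from_links(links):
    """Follow antecedent links to group mention indices into entities."""
    cluster_of, clusters = {}, []
    for i, ant in enumerate(links):
        if ant is None:                       # mention starts a new entity
            cluster_of[i] = len(clusters)
            clusters.append([i])
        else:                                 # join the antecedent's entity
            cluster_of[i] = cluster_of[ant]
            clusters[cluster_of[i]].append(i)
    return clusters

# Toy scorer: mentions with identical surface forms corefer.
mentions = ["Merkel", "the chancellor", "Merkel", "Paris"]
score = lambda i, j: 1.0 if mentions[i] == mentions[j] else -1.0
links = link_antecedents(mentions, score)     # [None, None, 0, None]
print(clusters_from_links(links))             # [[0, 2], [1], [3]]
```

In a neural system the scorer would be learned over span representations, but the argmax-over-candidates decoding step has this same shape.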
McGill at CRAC 2023: Multilingual Generalization of Entity-Ranking Coreference Resolution Models
Ian Porada | Jackie Chi Kit Cheung
Our submission to the CRAC 2023 shared task, described herein, is an adapted entity-ranking model jointly trained on all 17 datasets spanning 12 languages. Our model outperforms the shared task baselines by +8.47 F1, reaching a final F1 score of 65.43 and placing fourth in the shared task. We explore design decisions related to data preprocessing, the pretrained encoder, and data mixing.