ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution

Milan Straka, Jana Straková


Abstract
We describe the winning submission to the CRAC 2022 Shared Task on Multilingual Coreference Resolution. Our system first performs mention detection and then coreference linking on the retrieved spans with an antecedent-maximization approach; both tasks are fine-tuned jointly with shared Transformer weights. We report results of fine-tuning a wide range of pretrained models. This contribution centers on fine-tuned multilingual models. We found that a single large multilingual model with a sufficiently large encoder increases performance on all datasets across the board, with the benefit not limited to underrepresented languages or groups of typologically related languages. The source code is available at https://github.com/ufal/crac2022-corpipe.
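The linking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that mention spans have already been detected, that some model has produced a score for every (mention, candidate antecedent) pair, and that a mention selecting itself starts a new entity. The function names `link_antecedents` and `build_clusters` and the score values are hypothetical.

```python
from typing import List, Sequence


def link_antecedents(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring antecedent for every mention.

    scores[i][j] is an assumed model score for linking mention i to
    candidate j, where j ranges over 0..i. Choosing j == i is taken to
    mean "mention i starts a new entity" (an assumption of this sketch).
    """
    antecedents = []
    for i, row in enumerate(scores):
        candidates = row[: i + 1]  # only preceding mentions and the mention itself
        best = max(range(len(candidates)), key=lambda j: candidates[j])
        antecedents.append(best)
    return antecedents


def build_clusters(antecedents: Sequence[int]) -> List[List[int]]:
    """Group mentions into entities by following the chosen antecedent links."""
    cluster_of = {}
    clusters: List[List[int]] = []
    for i, a in enumerate(antecedents):
        if a == i:
            # Self-link: open a new cluster for this mention.
            cluster_of[i] = len(clusters)
            clusters.append([i])
        else:
            # Join the cluster of the selected antecedent.
            cluster_of[i] = cluster_of[a]
            clusters[cluster_of[i]].append(i)
    return clusters


# Toy scores for four mentions (made-up numbers, for illustration only).
scores = [
    [0.0],
    [0.2, 0.9],
    [1.5, 0.1, 0.3],
    [0.1, 2.0, 0.4, 0.5],
]
antecedents = link_antecedents(scores)
clusters = build_clusters(antecedents)
```

With these toy scores, mentions 0 and 1 each start an entity, mention 2 links to mention 0, and mention 3 links to mention 1. In a jointly trained system the scores would come from the same Transformer encoder that drives mention detection.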
Anthology ID:
2022.crac-mcr.4
Volume:
Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Zdeněk Žabokrtský, Maciej Ogrodniczuk
Venue:
CRAC
Publisher:
Association for Computational Linguistics
Pages:
28–37
URL:
https://aclanthology.org/2022.crac-mcr.4
Cite (ACL):
Milan Straka and Jana Straková. 2022. ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution. In Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution, pages 28–37, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution (Straka & Straková, CRAC 2022)
PDF:
https://aclanthology.org/2022.crac-mcr.4.pdf
Code:
ufal/crac2022-corpipe