French Coreference for Spoken and Written Language

Rodrigo Wilkens, Bruno Oberle, Frédéric Landragin, Amalia Todirascu


Abstract
Coreference resolution aims at identifying and grouping all mentions that refer to the same entity. For French, most systems are evaluated under different setups, which makes them difficult to compare. In this paper, we present an extensive comparison of several coreference resolution systems for French. The systems have been trained on two corpora (ANCOR for spoken language and Democrat for written language) annotated with coreference chains and augmented with syntactic and semantic information. The models are compared under different configurations (e.g. with and without singletons). In addition, we evaluate mention detection and coreference resolution separately. We present a full-stack model that outperforms other approaches. This model allows us to study the impact of mention detection errors on coreference resolution. Our analysis shows that mention detection can be improved by focusing on boundary identification, while advances in detecting pronoun-noun relations can help the coreference task. Another contribution of this work is the first end-to-end neural French coreference resolution model trained on Democrat (written texts), which is comparable to the state-of-the-art systems for spoken French.
Anthology ID:
2020.lrec-1.10
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
80–89
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.10
Cite (ACL):
Rodrigo Wilkens, Bruno Oberle, Frédéric Landragin, and Amalia Todirascu. 2020. French Coreference for Spoken and Written Language. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 80–89, Marseille, France. European Language Resources Association.
Cite (Informal):
French Coreference for Spoken and Written Language (Wilkens et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.10.pdf