Olatz Arregi


2020

Sequence to Sequence Coreference Resolution
Gorka Urbizu | Ander Soraluze | Olatz Arregi
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

Until recently, coreference resolution was a critical task in the pipeline of any NLP task involving deep language understanding, such as machine translation, chatbots, summarization or sentiment analysis. Nowadays, however, those end tasks are learned end-to-end by deep neural networks without adding any explicit knowledge about coreference. Thus, coreference resolution is used less in the training of other NLP tasks or of trending pretrained language models. In this paper we present a new approach that frames coreference resolution as a sequence-to-sequence task based on the Transformer architecture. This approach is simple and universal, compatible with any language or dataset (regardless of singletons), and easier to integrate with current language model architectures. We test it on the ARRAU corpus, where we obtain 65.6 F1 CoNLL. We see this approach not as a final goal, but as a means to pretrain sequence-to-sequence language models (T5) on coreference resolution.

2019

Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque
Gorka Urbizu | Ander Soraluze | Olatz Arregi
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference

In this paper, we present a cross-lingual neural coreference resolution system for a less-resourced language, Basque. To begin with, we build the first neural coreference resolution system for Basque, training it on the relatively small EPEC-KORREF corpus (45,000 words). Next, we design a cross-lingual coreference resolution system. With this approach, the system learns from a bigger English corpus, using cross-lingual embeddings, to perform coreference resolution for Basque. The cross-lingual system obtains slightly better results (40.93 F1 CoNLL) than the monolingual system (39.12 F1 CoNLL), without using any Basque corpus for training.

2017

Enriching Basque Coreference Resolution System using Semantic Knowledge sources
Ander Soraluze | Olatz Arregi | Xabier Arregi | Arantza Díaz de Ilarraza
Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)

In this paper we present a Basque coreference resolution system enriched with semantic knowledge. An error analysis revealed the system's deficiencies in resolving coreference cases that require semantic or world knowledge. We attempt to address these deficiencies using two semantic knowledge sources, specifically Wikipedia and WordNet.

2016

Coreference Resolution for the Basque Language with BART
Ander Soraluze | Olatz Arregi | Xabier Arregi | Arantza Díaz de Ilarraza | Mijail Kabadjov | Massimo Poesio
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2016)

2015

bRol: The Parser of Syntactic and Semantic Dependencies for Basque
Haritz Salaberri | Olatz Arregi | Beñat Zapirain
Proceedings of the International Conference Recent Advances in Natural Language Processing

IXAGroupEHUDiac: A Multiple Approach System towards the Diachronic Evaluation of Texts
Haritz Salaberri | Iker Salaberri | Olatz Arregi | Beñat Zapirain
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

IXAGroupEHUSpaceEval: (X-Space) A WordNet-based approach towards the Automatic Recognition of Spatial Information following the ISO-Space Annotation Scheme
Haritz Salaberri | Olatz Arregi | Beñat Zapirain
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

A Multi-classifier Approach to support Coreference Resolution in a Vector Space Model
Ana Zelaia | Olatz Arregi | Basilio Sierra
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

2014

First approach toward Semantic Role Labeling for Basque
Haritz Salaberri | Olatz Arregi | Beñat Zapirain
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we present the first Semantic Role Labeling system developed for Basque. The system is implemented using machine learning techniques and trained on the Reference Corpus for the Processing of Basque (EPEC). In our experiments, the classifier that offers the best results is based on Support Vector Machines. Our system achieves an F1 score of 84.30 in identifying the PropBank semantic role for a given constituent and 82.90 in identifying the VerbNet role. Our study establishes a baseline for Basque SRL. Although there are no directly comparable systems for English, we can state that the results we have achieved are quite good. In addition, we have performed a Leave-One-Out feature selection procedure to establish which features are most valuable for argument classification. This will help smooth the way for future stages of Basque SRL and will help draw some of the guidelines of our research.

2011

Recognition and Classification of Numerical Entities in Basque
Ander Soraluze | Iñaki Alegria | Olatz Ansa | Olatz Arregi | Xabier Arregi
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2009

A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition
Ana Zelaia | Olatz Arregi | Basilio Sierra
Proceedings of the Eight International Conference on Computational Semantics

2007

UBC-ZAS: A k-NN based Multiclassifier System to perform WSD in a Reduced Dimensional Vector Space
Ana Zelaia | Olatz Arregi | Basilio Sierra
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

A Multiclassifier based Document Categorization System: profiting from the Singular Value Decomposition Dimensionality Reduction Technique
Ana Zelaia | Iñaki Alegria | Olatz Arregi | Basilio Sierra
Proceedings of the Workshop on Learning Structured Information in Natural Language Applications