PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation

Dimitris Papadopoulos, Nikolaos Papadakis, Nikolaos Matsatsinis


Abstract
In this work, we present a methodology that aims to bridge the gap between high- and low-resource languages in the context of Open Information Extraction (OIE), showcasing it on the Greek language. Our approach has two parts. First, we build Neural Machine Translation (NMT) models for English-to-Greek and Greek-to-English based on the Transformer architecture. Second, we use these NMT models to translate Greek text into English, feed the translations to our NLP pipeline, which applies a series of pre-processing and triple extraction tasks, and back-translate the extracted triples to Greek. We evaluate both our NMT and OIE methods on benchmark datasets and demonstrate that our approach outperforms the current state of the art for Greek.
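
The translate, extract, back-translate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_triples_via_english`, the dictionary-based toy translators, and the single-pattern extractor are all hypothetical stand-ins for the Transformer NMT models and the OIE pipeline, chosen only so that the example runs on its own.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]          # (subject, relation, object)
Translator = Callable[[str], str]
Extractor = Callable[[str], List[Triple]]


def extract_triples_via_english(
    greek_text: str,
    el_to_en: Translator,
    extract: Extractor,
    en_to_el: Translator,
) -> List[Triple]:
    """Pivot through English: translate, extract triples, back-translate."""
    english_text = el_to_en(greek_text)            # Greek -> English
    english_triples = extract(english_text)        # OIE on the English text
    # Back-translate each element of every triple to Greek.
    return [(en_to_el(s), en_to_el(r), en_to_el(o)) for s, r, o in english_triples]


# Toy stand-ins (assumptions for illustration only, not real NMT or OIE models).
EL_EN = {"Η Αθήνα είναι η πρωτεύουσα της Ελλάδας": "Athens is the capital of Greece"}
EN_EL = {
    "Athens": "Η Αθήνα",
    "is": "είναι",
    "the capital of Greece": "η πρωτεύουσα της Ελλάδας",
}


def toy_el_to_en(text: str) -> str:
    return EL_EN[text]


def toy_en_to_el(text: str) -> str:
    return EN_EL[text]


def toy_extract(sentence: str) -> List[Triple]:
    # Handles only "<subject> is <object>" sentences.
    subject, sep, obj = sentence.partition(" is ")
    return [(subject, "is", obj)] if sep else []


triples = extract_triples_via_english(
    "Η Αθήνα είναι η πρωτεύουσα της Ελλάδας",
    toy_el_to_en,
    toy_extract,
    toy_en_to_el,
)
print(triples)
```

Passing the translators and the extractor as callables keeps the pivot logic independent of any particular model, so real NMT and OIE components could be substituted for the toy functions without changing `extract_triples_via_english`.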
Anthology ID:
2021.eacl-srw.4
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
April
Year:
2021
Address:
Online
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
23–29
URL:
https://aclanthology.org/2021.eacl-srw.4
DOI:
10.18653/v1/2021.eacl-srw.4
Cite (ACL):
Dimitris Papadopoulos, Nikolaos Papadakis, and Nikolaos Matsatsinis. 2021. PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 23–29, Online. Association for Computational Linguistics.
Cite (Informal):
PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation (Papadopoulos et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-srw.4.pdf
Code
lighteternal/PENELOPIE
Data
Tatoeba