Avril Gazeau


2024

Flexible Lexicalization in Rule-based Text Realization
Avril Gazeau | Francois Lareau
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

GenDR is a text realizer that takes as input a graph-based semantic representation and outputs the corresponding syntactic dependency trees. One of the tasks in this transduction is lexicalization, i.e., choosing the right lexical units to express a given semanteme. To do so, GenDR uses a semantic dictionary that maps semantemes to corresponding lexical units in a given language. This study aims to develop a flexible lexicalization module to automatically build a rich semantic dictionary for French. To achieve this, we tried two methods. The first consisted of extracting information from the French Lexical Network, a large-scale French lexical resource, and adapting it to GenDR. The second was to test a contextual neural language model’s ability to generate potential additional lexicalizations. The first method significantly broadened the coverage of GenDR, while the additional lexicalizations produced by the language model turned out to be of limited use, which leads us to conclude that it is not suited to the task we asked of it.