Rayan Ziane
2025
Explicit Edge Length Coding to Improve Long Sentence Parsing Performance
Khensa Daoudi | Mathieu Dehouck | Rayan Ziane | Natasha Romanova
Proceedings of the First Workshop on Advancing NLP for Low-Resource Languages
The performance of syntactic parsers drops on longer sentences. While some of this drop can be explained by the tendency of longer sentences to be more syntactically complex and by the larger number of candidate governors, some of it is due to longer sentences being more challenging to encode. This is especially relevant for low-resource scenarios such as parsing written sources in historical languages (e.g. medieval and early-modern European languages), in particular legal texts, where sentences can be very long while the amount of training material remains limited. In this paper, we present a new method that explicitly uses arc length information to bias the scores produced by a graph-based parser. In a series of experiments on Norman and Gascon data, in which we divide the test data according to sentence length, we show that explicit length coding helps retain parsing performance on longer sentences.
2021
A morph-based and a word-based treebank for Beja
Sylvain Kahane | Martine Vanhove | Rayan Ziane | Bruno Guillaume
Proceedings of the 20th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2021)
Co-authors
- Khensa Daoudi 1
- Mathieu Dehouck 1
- Bruno Guillaume 1
- Sylvain Kahane 1
- Natasha Romanova 1