Massimiliano Pronesti
2024
Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups
Simon Mille | Massimiliano Pronesti | Craig Thomson | Michela Lorandi | Sophie Fitzpatrick | Rudali Huidrom | Mohammed Sabry | Amy O’Riordan | Anya Belz
Proceedings of the 17th International Natural Language Generation Conference: System Demonstrations
Wikipedia is known to have systematic gaps in its coverage that correspond to under-resourced languages as well as underrepresented groups. This paper presents a new tool to support efforts to fill in these gaps by automatically generating draft articles and facilitating post-editing and uploading to Wikipedia. A rule-based generator and an input-constrained LLM are used to generate two alternative articles, enabling the often more fluent, but error-prone, LLM-generated article to be content-checked against the more reliable, but less fluent, rule-generated article.