Chang Shen
2022
Surfer100: Generating Surveys From Web Resources, Wikipedia-style
Irene Li | Alex Fabbri | Rina Kawamura | Yixin Liu | Xiangru Tang | Jaesung Tae | Chang Shen | Sally Ma | Tomoe Mizutani | Dragomir Radev
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Fast-developing fields such as Artificial Intelligence (AI) often outpace the efforts of encyclopedic sources such as Wikipedia, which either do not completely cover recently-introduced topics or lack such content entirely. As a result, methods for automatically producing content are valuable tools to address this information overload. We show that recent advances in pretrained language modeling can be combined for a two-stage extractive and abstractive approach for Wikipedia lead paragraph generation. We extend this approach to generate longer Wikipedia-style summaries with sections and examine how such methods struggle in this application through detailed studies with 100 reference human-collected surveys. This is the first study on utilizing web resources for long Wikipedia-style summaries to the best of our knowledge.