Noise, Novels, Numbers. A Framework for Detecting and Categorizing Noise in Danish and Norwegian Literature

Ali Al-Laith, Daniel Hershcovich, Jens Bjerring-Hansen, Jakob Parby, Alexander Conroy, Timothy Tangherlini


Abstract
We present a framework for detecting and categorizing noise in literary texts, demonstrated through its application to Danish and Norwegian literature from the late 19th century. Noise, understood as “aberrant sonic behaviour,” is not only an auditory phenomenon but also a cultural construct tied to the processes of civilization and urbanization. We begin by utilizing topic modeling techniques to identify noise-related documents, followed by fine-tuning BERT-based language models trained on Danish and Norwegian texts to analyze a corpus of over 800 novels. We identify and track the prevalence of noise in these texts, offering insights into the literary perceptions of noise during the Scandinavian “Modern Breakthrough” period (1870–1899). Our contributions include the development of a comprehensive dataset annotated for noise-related segments and their categorization into human-made, non-human-made, and musical noises. This study illustrates the framework’s potential for enhancing the understanding of the relationship between noise and its literary representations, providing a deeper appreciation of the auditory elements in literary works, including as sources for cultural history.
Anthology ID:
2024.emnlp-main.196
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3344–3354
URL:
https://aclanthology.org/2024.emnlp-main.196
Cite (ACL):
Ali Al-Laith, Daniel Hershcovich, Jens Bjerring-Hansen, Jakob Parby, Alexander Conroy, and Timothy Tangherlini. 2024. Noise, Novels, Numbers. A Framework for Detecting and Categorizing Noise in Danish and Norwegian Literature. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 3344–3354, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Noise, Novels, Numbers. A Framework for Detecting and Categorizing Noise in Danish and Norwegian Literature (Al-Laith et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.196.pdf