Jan Florjanczyk
2021
Hierarchical Encoders for Modeling and Interpreting Screenplays
Gayatri Bhat | Avneesh Saluja | Melody Dye | Jan Florjanczyk
Proceedings of the Third Workshop on Narrative Understanding
While natural language understanding of long-form documents remains an open challenge, such documents often contain structural information that can inform the design of models encoding them. Movie scripts are an example of such richly structured text – scripts are segmented into scenes, which decompose into dialogue and descriptive components. In this work, we propose a neural architecture to encode this structure, which performs robustly on two multi-label tag classification tasks without using handcrafted features. We add a layer of insight by augmenting the encoder with an unsupervised ‘interpretability’ module, which can be used to extract and visualize narrative trajectories. Though this work specifically tackles screenplays, we discuss how the underlying approach can be generalized to a range of structured documents.
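To make the structural idea in the abstract concrete, the following is a minimal sketch of hierarchical encoding, not the architecture proposed in the paper. It assumes a script is a list of scenes, each holding dialogue and description strings, and it substitutes hash-derived toy embeddings and mean pooling for any learned components. The names `Scene`, `embed_word`, `mean_pool`, `encode_text`, `encode_scene`, `encode_script`, and the dimension `DIM` are all hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List

DIM = 8  # toy embedding size (assumption, not from the paper)


@dataclass
class Scene:
    """One scene, split into its dialogue and descriptive components."""
    dialogue: List[str] = field(default_factory=list)
    description: List[str] = field(default_factory=list)


def embed_word(word: str) -> List[float]:
    """Deterministic toy embedding from a hash; stands in for learned embeddings."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]


def mean_pool(vectors: List[List[float]], dim: int) -> List[float]:
    """Average a list of equal-length vectors; return zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode_text(text: str) -> List[float]:
    """Word level: pool word embeddings into one vector per utterance or description."""
    return mean_pool([embed_word(w) for w in text.split()], DIM)


def encode_scene(scene: Scene) -> List[float]:
    """Scene level: pool each component separately, then concatenate the two halves."""
    dialogue_vec = mean_pool([encode_text(t) for t in scene.dialogue], DIM)
    description_vec = mean_pool([encode_text(t) for t in scene.description], DIM)
    return dialogue_vec + description_vec  # list concatenation, length 2 * DIM


def encode_script(scenes: List[Scene]) -> List[float]:
    """Script level: pool scene vectors into a single document representation."""
    return mean_pool([encode_scene(s) for s in scenes], 2 * DIM)


script = [
    Scene(dialogue=["Where are we going?"],
          description=["A car speeds down the highway."]),
    Scene(description=["Night. The city is quiet."]),
]
script_vector = encode_script(script)
print(len(script_vector))  # 16
```

In this sketch the script vector could feed a multi-label classifier; a trained model would replace the pooling steps with learned encoders at each level of the hierarchy.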