Shweta Misra
2024
Nested Noun Phrase Identification Using BERT
Shweta Misra
|
Johan Boye
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
For several NLP tasks, an important substep is the identification of noun phrases in running text. This has typically been done by “chunking” – a way of finding minimal noun phrases by token classification. However, chunking-like methods do not represent the fact that noun phrases can be nested. This paper presents a novel method of finding all noun phrases in a sentence, nested to an arbitrary depth, using the BERT model for token classification. We show that our proposed method achieves very good results for both Swedish and English.