Tom De Keyser


2020

pdf bib
Intelligent Analyses on Storytelling for Impact Measurement
Koen Kicken | Tessa De Maesschalck | Bart Vanrumste | Tom De Keyser | Hee Reen Shim
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

This paper explores how Dutch diary fragments, written by family coaches in the social sector, can be analysed automatically using machine learning techniques to quantitatively measure the impact of social coaching. The focus lays on two tasks: determining which sentiment a fragment contains (sentiment analysis) and investigating which fundamental social rights (education, employment, legal aid, etc.) are addressed in the fragment. To train and test the new algorithms, a dataset consisting of 1715 Dutch diary fragments is used. These fragments are manually labelled on sentiment and on the applicable fundamental social rights. The sentiment analysis models were trained to classify the fragments into three classes: negative, neutral or positive. Fine-tuning the Dutch pre-trained Bidirectional Encoder Representations from Transformers (BERTje) (de Vries et al., 2019) language model surpassed the more classic algorithms by correctly classifying 79.6% of the fragments on the sentiment analysis, which is considered as a good result. This technique also achieved the best results in the identification of the fundamental rights, where for every fragment the three most likely fundamental rights were given as output. In this way, 93% of the present fundamental rights were correctly recognised. To our knowledge, we are the first to try to extract social rights from written text with the help of Natural Language Processing techniques.