Peter Mutschke

2018

Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications
Andrea Zielinski | Peter Mutschke
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib abs

Mining Social Science Publications for Survey Variables
Andrea Zielinski | Peter Mutschke
Proceedings of the Second Workshop on NLP and Computational Social Science

Research in Social Science is usually based on survey data where individual research questions relate to observable concepts (variables). However, due to a lack of standards for data citations a reliable identification of the variables used is often difficult. In this paper, we present a work-in-progress study that seeks to provide a solution to the variable detection task based on supervised machine learning algorithms, using a linguistic analysis pipeline to extract a rich feature set, including terminological concepts and similarity metric scores. Further, we present preliminary results on a small dataset that has been specifically designed for this task, yielding a significant increase in performance over the random baseline.

Co-authors

Andrea Zielinski 2

Venues

LREC1
NLP+CSS1

Fix author