Assessing Users’ Reputation from Syntactic and Semantic Information in Community Question Answering

Yonas Woldemariam


Abstract
Textual content is the most significant and by far the largest part of CQA (Community Question Answering) forums, and users gain reputation for contributing such content. Although linguistic quality is the very essence of textual information, it does not seem to be considered when estimating users’ reputation. As existing user reputation systems seem to rely solely on vote counting, adding linguistic information should improve their quality. In this study, we investigate the relationship between users’ reputation and linguistic features extracted from the content of their answers. We build statistical models on a Stack Overflow dataset that learn reputation from the complex syntactic and semantic structures of such content. The resulting models reveal how users’ writing styles in answering questions play important roles in building reputation points. Our experiments consist of extracting answers from systematically selected users, followed by linguistic feature annotation and model building. The models are evaluated on in-domain (e.g., Server Fault, Super User) and out-of-domain (e.g., English, Maths) datasets. We found that the selected linguistic features have a significant influence on reputation scores. In the best-case scenario, the selected linguistic feature set could explain 80% of the variation in reputation scores, with a prediction error of 3%. The performance of the baseline models improved significantly when syntactic and punctuation mark features were added.
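To make the phrase "explain 80% variation in reputation scores" concrete, the sketch below shows how the share of explained variation (the coefficient of determination, R²) is computed for a fitted model. It is only an illustration under stated assumptions: the abstract does not say which statistical model was used, so a one-feature least-squares line stands in for it, the helper names `fit_line` and `r_squared` are hypothetical, and the toy numbers are invented rather than taken from the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(ys, preds):
    """Share of the variance in ys that the predictions account for."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)             # total variation
    return 1 - ss_res / ss_tot


# Toy values invented for illustration; NOT data from the paper.
# `feature` stands in for one linguistic feature per user (assumption),
# `reputation` for a scaled reputation score (assumption).
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
reputation = [2.0, 4.1, 5.9, 8.2, 9.8]

slope, intercept = fit_line(feature, reputation)
preds = [slope * x + intercept for x in feature]
r2 = r_squared(reputation, preds)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
```

An R² of 0.8 on held-out users would mean that the feature set accounts for 80% of the variance in reputation scores; the reported prediction error is a separate measure, and the abstract does not define it.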
Anthology ID:
2020.lrec-1.662
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5383–5391
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.662
Cite (ACL):
Yonas Woldemariam. 2020. Assessing Users’ Reputation from Syntactic and Semantic Information in Community Question Answering. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5383–5391, Marseille, France. European Language Resources Association.
Cite (Informal):
Assessing Users’ Reputation from Syntactic and Semantic Information in Community Question Answering (Woldemariam, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.662.pdf