Prediction for the Newsroom: Which Articles Will Get the Most Comments?

Carl Ambroselli, Julian Risch, Ralf Krestel, Andreas Loos


Abstract
The overwhelming success of the Web and mobile technologies has enabled millions to share their opinions publicly at any time. But the same success also endangers this freedom of speech, because participatory sites that are misused by individuals or interest groups are being closed down. We propose to support manual moderation by proactively drawing the attention of our moderators to article discussions that most likely need their intervention. To this end, we predict which articles will receive a high number of comments. In contrast to existing work, we enrich the article with metadata, extract semantic and linguistic features, and exploit annotated data from a foreign language corpus. Our logistic regression model improves F1-scores by over 80% in comparison to state-of-the-art approaches.
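As a rough illustration of the setup the abstract describes, the following sketch trains a tiny logistic regression classifier to flag articles likely to draw many comments and scores it with F1. Everything in it is an assumption made for illustration: the two toy features, the data, the hyperparameters, and the function names are hypothetical and are not taken from the paper or its repository.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Linear score w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi  # derivative of log loss w.r.t. score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X, threshold=0.5):
    """Label an article 1 (many comments expected) if its probability reaches the threshold."""
    return [1 if sigmoid(score(w, b, xi)) >= threshold else 0 for xi in X]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: [scaled headline length, 1 if the article is in a politics section].
X = [[0.2, 0], [0.3, 0], [0.4, 0], [0.1, 0],
     [0.8, 1], [0.9, 1], [0.7, 1], [0.6, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = article received a high number of comments

w, b = train_logistic_regression(X, y)
preds = predict(w, b, X)
print("F1 on the toy training set:", f1_score(y, preds))
```

The sketch evaluates on its own training data only to keep it short; a realistic evaluation would hold out unseen articles and use far richer features than these two.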
Anthology ID:
N18-3024
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)
Month:
June
Year:
2018
Address:
New Orleans - Louisiana
Editors:
Srinivas Bangalore, Jennifer Chu-Carroll, Yunyao Li
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
193–199
URL:
https://aclanthology.org/N18-3024
DOI:
10.18653/v1/N18-3024
Cite (ACL):
Carl Ambroselli, Julian Risch, Ralf Krestel, and Andreas Loos. 2018. Prediction for the Newsroom: Which Articles Will Get the Most Comments?. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers), pages 193–199, New Orleans - Louisiana. Association for Computational Linguistics.
Cite (Informal):
Prediction for the Newsroom: Which Articles Will Get the Most Comments? (Ambroselli et al., NAACL 2018)
PDF:
https://aclanthology.org/N18-3024.pdf
Video:
https://aclanthology.org/N18-3024.mp4
Code:
julian-risch/NAACL2018