Doris Hoogeveen


Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata
Steven Xu | Andrew Bennett | Doris Hoogeveen | Jey Han Lau | Timothy Baldwin
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Community question answering (cQA) forums provide a rich source of data for facilitating non-factoid question answering over many technical domains. Given this, there is considerable interest in answer retrieval from these kinds of forums. However, this is a difficult task, as the structure of these forums is very rich, and both metadata and text features are important for successful retrieval. While there has recently been a lot of work on solving this problem using deep learning models applied to question/answer text, this work has not looked at how to make use of the rich metadata available in cQA forums. We propose an attention-based model which achieves state-of-the-art results for text-based answer selection alone, and by making use of complementary metadata, achieves a substantially higher result over two reference datasets novel to this work.


SemEval-2017 Task 3: Community Question Answering
Preslav Nakov | Doris Hoogeveen | Lluís Màrquez | Alessandro Moschitti | Hamdy Mubarak | Timothy Baldwin | Karin Verspoor
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We describe SemEval-2017 Task 3 on Community Question Answering. This year, we reran the four subtasks from SemEval-2016: (A) Question–Comment Similarity, (B) Question–Question Similarity, (C) Question–External Comment Similarity, and (D) Rerank the correct answers for a new question in Arabic, providing all the data from 2015 and 2016 for training, and fresh data for testing. Additionally, we added a new subtask E in order to enable experimentation with Multi-domain Question Duplicate Detection in a larger-scale scenario, using StackExchange subforums. A total of 23 teams participated in the task, and submitted a total of 85 runs (36 primary and 49 contrastive) for subtasks A–D. Unfortunately, no teams participated in subtask E. A variety of approaches and features were used by the participating systems to address the different subtasks. The best systems achieved an official score (MAP) of 88.43, 47.22, 15.46, and 61.16 in subtasks A, B, C, and D, respectively. These scores are better than the baselines, especially for subtasks A–C.


UniMelb at SemEval-2016 Task 3: Identifying Similar Questions by combining a CNN with String Similarity Measures
Timothy Baldwin | Huizhi Liang | Bahar Salehi | Doris Hoogeveen | Yitong Li | Long Duong
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)