ExpertPLM: Pre-training Expert Representation for Expert Finding

Qiyao Peng, Hongtao Liu


Abstract
Expert Finding is an important task on Community Question Answering (CQA) platforms, helping route questions to users who are likely to answer them. The key is to accurately learn expert representations from the questions each expert has historically answered. In this paper, inspired by the strong text-understanding ability of Pre-trained Language Models (PLMs), we propose a pre-training and fine-tuning framework for expert finding. At its core is an expert-level pre-training paradigm that effectively integrates expert interest and expertise simultaneously. Specifically, unlike typical corpus-level pre-training, we treat each expert as the basic pre-training unit, consisting of all the question titles the expert has historically answered, which fully reflect the expert's interests. Besides, we incorporate the vote score attached to each of the expert's answers into the pre-training phase to model the expert's ability explicitly. Finally, we propose a novel reputation-augmented Masked Language Model (MLM) pre-training strategy to capture expert reputation information. In this way, our method learns comprehensive expert representations, which are then adopted and fine-tuned on the downstream expert-finding task. Extensive experimental results on six real-world CQA datasets demonstrate the effectiveness of our method.
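The abstract describes the pre-training setup only at a high level. As a rough illustration, the sketch below shows one plausible way to build an expert-level pre-training unit (all answered question titles of one expert, each prefixed with a coarse vote-score token) and to apply a reputation-augmented MLM masking that masks vote-score tokens more aggressively than ordinary words. The function names, special tokens, vote-score bucketing, and masking probabilities are illustrative assumptions, not the paper's actual implementation.

```python
import random

MASK, SEP, CLS = "[MASK]", "[SEP]", "[CLS]"

def bucket_vote_score(score, buckets=(0, 1, 5, 10)):
    """Map a raw vote score to a coarse reputation token, e.g. [VOTE_2].
    (Hypothetical bucketing; the paper does not specify one.)"""
    level = sum(score >= b for b in buckets)
    return f"[VOTE_{level}]"

def build_expert_sequence(answered, max_len=128):
    """Concatenate all question titles one expert has answered into a single
    pre-training unit, prefixing each title with its vote-score token."""
    tokens = [CLS]
    for title, votes in answered:
        tokens.append(bucket_vote_score(votes))
        tokens.extend(title.lower().split())
        tokens.append(SEP)
    return tokens[:max_len]

def mask_for_mlm(tokens, mask_prob=0.15, reputation_boost=0.15, seed=None):
    """Reputation-augmented masking: ordinary word tokens are masked with the
    usual MLM probability, while vote-score tokens are masked more often so
    the model must infer an expert's reputation from the surrounding questions."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if tok in (CLS, SEP):
            inputs.append(tok)
            labels.append(None)
            continue
        p = mask_prob + (reputation_boost if tok.startswith("[VOTE_") else 0.0)
        if rng.random() < p:
            inputs.append(MASK)
            labels.append(tok)  # predict the original token at masked positions
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels

# Toy usage: one expert who answered two questions with different vote scores.
expert_history = [("How to tune BERT for QA?", 12), ("What is dropout?", 1)]
seq = build_expert_sequence(expert_history)
masked, targets = mask_for_mlm(seq, seed=0)
print(masked)
print([t for t in targets if t is not None])
```

The masked sequences and targets produced this way would then feed a standard MLM objective; the fine-tuning stage on the expert-finding task is not shown here.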
Anthology ID:
2022.findings-emnlp.74
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1043–1052
URL:
https://aclanthology.org/2022.findings-emnlp.74
DOI:
10.18653/v1/2022.findings-emnlp.74
Cite (ACL):
Qiyao Peng and Hongtao Liu. 2022. ExpertPLM: Pre-training Expert Representation for Expert Finding. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 1043–1052, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
ExpertPLM: Pre-training Expert Representation for Expert Finding (Peng & Liu, Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.74.pdf