xPQA: Cross-Lingual Product Question Answering in 12 Languages

Xiaoyu Shen, Akari Asai, Bill Byrne, Adria De Gispert


Abstract
Product Question Answering (PQA) systems are key in e-commerce applications as they provide responses to customers’ questions while they shop for products. While existing work on PQA focuses mainly on English, in practice there is a need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. We evaluate various approaches involving machine translation at runtime or offline, leveraging multilingual pre-trained LMs, and including or excluding xPQA training data. We find that in-domain data is essential as cross-lingual rankers trained on other domains perform poorly on the PQA task, and that translation-based approaches are most effective for candidate ranking while multilingual finetuning works best for answer generation. Still, there remains a significant performance gap between the English and the cross-lingual test sets.
Anthology ID:
2023.acl-industry.12
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Sunayana Sitaram, Beata Beigman Klebanov, Jason D Williams
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
103–115
URL:
https://aclanthology.org/2023.acl-industry.12
DOI:
10.18653/v1/2023.acl-industry.12
Cite (ACL):
Xiaoyu Shen, Akari Asai, Bill Byrne, and Adria De Gispert. 2023. xPQA: Cross-Lingual Product Question Answering in 12 Languages. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 103–115, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
xPQA: Cross-Lingual Product Question Answering in 12 Languages (Shen et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-industry.12.pdf