Stronger, Lighter, Better: Towards Life-Long Attribute Value Extraction for E-Commerce Products

Tao Zhang, Chenwei Zhang, Xian Li, Jingbo Shang, Hoang Nguyen, Philip Yu


Abstract
Attribute value extraction involves identifying the value spans of predetermined attributes in product texts. This area of research has traditionally operated under a closed-world assumption, focusing on products from a static set of categories and their associated attributes. However, products in e-commerce stores are ever-increasing and evolving, calling for life-long learning. If continuously trained on the fast-growing set of products and attributes, most existing solutions not only struggle with parameter efficiency but also suffer foreseeable defects such as data contamination and catastrophic forgetting. As a remedy, we propose and study a new task, which aims to effectively maintain a single strong model for many domains in a life-long learning fashion, without jeopardizing model performance or parameter efficiency. We introduce factorization into the model and make it domain-aware by decoupling the modeling of product type and attribute, as a way to promote de-contamination and parameter efficiency while scaling up. Tuning the model with distillation prevents forgetting historical knowledge and enables continuous learning from emerging domains. Experiments on hundreds of domains show that our model attains near state-of-the-art performance with an affordable parameter size, the least forgetting of historical knowledge, and the greatest robustness against noise, while adding only a few parameters per domain when compared with competitive baselines.
Anthology ID:
2024.findings-acl.510
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
8631–8643
URL:
https://aclanthology.org/2024.findings-acl.510
DOI:
10.18653/v1/2024.findings-acl.510
Cite (ACL):
Tao Zhang, Chenwei Zhang, Xian Li, Jingbo Shang, Hoang Nguyen, and Philip Yu. 2024. Stronger, Lighter, Better: Towards Life-Long Attribute Value Extraction for E-Commerce Products. In Findings of the Association for Computational Linguistics: ACL 2024, pages 8631–8643, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Stronger, Lighter, Better: Towards Life-Long Attribute Value Extraction for E-Commerce Products (Zhang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.510.pdf