Yong Pan
2023
CoMave: Contrastive Pre-training with Multi-scale Masking for Attribute Value Extraction
Xinnan Guo | Wentao Deng | Yongrui Chen | Yang Li | Mengdi Zhou | Guilin Qi | Tianxing Wu | Dong Yang | Liubin Wang | Yong Pan
Findings of the Association for Computational Linguistics: ACL 2023
Attribute Value Extraction (AVE) aims to automatically extract attribute-value pairs from product descriptions to support e-commerce applications. Although existing approaches perform increasingly well on e-commerce platforms, they still face two challenges: 1) difficulty identifying values at different scales simultaneously; 2) confusion between highly similar fine-grained attributes. This paper proposes a pre-training technique for AVE to address these issues. First, we improve the conventional token-level masking strategy, guiding the language model to understand multi-scale values by recovering masked spans at the phrase and sentence levels. Second, we apply clustering to build a challenging negative set for each example and design a contrastive pre-training objective that forces the model to discriminate between similar attributes. Comprehensive experiments show that our solution significantly improves over traditional pre-trained models on the AVE task and achieves state-of-the-art results on four benchmarks.
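The two ideas in the abstract, span masking at more than one scale and a contrastive objective over hard negatives, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give masking ratios, the clustering procedure, or the exact loss, so the sketch assumes a single masked span per example and a generic InfoNCE-style loss, and it takes the negative set as an input rather than building it by clustering. All names (`mask_span`, `multi_scale_mask`, `info_nce`, `phrase_len`, `temperature`) are hypothetical.

```python
import math
import random

MASK = "[MASK]"


def mask_span(tokens, span_len, rng):
    """Replace one random contiguous span of `span_len` tokens with MASK.

    Returns (masked_tokens, original_span). Assumes `tokens` is non-empty.
    """
    span_len = max(1, min(span_len, len(tokens)))
    start = rng.randrange(len(tokens) - span_len + 1)
    masked = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    return masked, tokens[start:start + span_len]


def multi_scale_mask(sentences, rng, phrase_len=3):
    """Build one phrase-level and one sentence-level recovery example.

    `sentences` is a list of token lists. `phrase_len` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    # Phrase level: hide a short span inside one randomly chosen sentence.
    pidx = rng.randrange(len(sentences))
    phrase_example = mask_span(sentences[pidx], phrase_len, rng)
    # Sentence level: hide every token of one randomly chosen sentence.
    sidx = rng.randrange(len(sentences))
    masked_doc = [
        [MASK] * len(s) if i == sidx else list(s)
        for i, s in enumerate(sentences)
    ]
    return phrase_example, (masked_doc, list(sentences[sidx]))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cosine(u, v):
    return _dot(u, v) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss over plain Python vectors (assumed objective).

    `negatives` stands in for the hard negative set; how it is built
    (clustering in the paper) is outside this sketch.
    """
    logits = [_cosine(anchor, positive) / temperature]
    logits += [_cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]
```

In this sketch the loss is small when the anchor is closer to its positive than to any negative, and grows as a negative becomes more similar to the anchor, which is the pressure a hard negative set is meant to apply.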