Semantic-conditioned Dual Adaptation for Cross-domain Query-based Visual Segmentation

Ye Wang, Tao Jin, Wang Lin, Xize Cheng, Linjun Li, Zhou Zhao


Abstract
Visual segmentation from language queries has attracted significant research interest. Despite their effectiveness, existing methods require expensive labeling and suffer severe degradation when deployed to an unseen domain. In this paper, we investigate a novel task, Cross-domain Query-based Visual Segmentation (CQVS), which aims to adapt a segmentation model from a labeled domain to a new unlabeled domain. The challenges of CQVS stem from three domain discrepancies: (1) multi-modal content shift, (2) uni-modal feature gap, and (3) cross-modal relation bias. Existing domain adaptation methods fail to address them comprehensively and precisely (e.g., at the pixel level), and are therefore suboptimal for CQVS. To overcome this limitation, we propose Semantic-conditioned Dual Adaptation (SDA), a novel framework that achieves precise feature and relation invariance across domains via a universal semantic structure. SDA consists of two key components: Content-aware Semantic Modeling (CSM) and Dual Adaptive Branches (DAB). First, CSM introduces a common semantic space across domains to provide uniform guidance. Then, DAB leverages this semantic information to develop a contrastive feature branch for category-wise pixel alignment and to design a reciprocal relation branch for relation enhancement via two complementary masks. Extensive experiments on three video benchmarks and three image benchmarks demonstrate the superiority of our approach over state-of-the-art methods.
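
The abstract does not specify how the contrastive feature branch is computed. As a rough illustration of what category-wise pixel alignment can look like in general, the sketch below builds one prototype per category from labeled source-domain pixel features and applies an InfoNCE-style loss that pulls target-domain pixels toward the prototype of their (pseudo-)label. The prototype construction, the cosine similarity, the temperature value, and all function names are assumptions made for illustration only; this is not the authors' implementation.

```python
import math


def class_prototypes(features, labels):
    """Mean feature vector per category (hypothetical helper, not from the paper)."""
    sums, counts = {}, {}
    for feat, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def pixel_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss: each pixel is pulled toward its own category
    prototype and pushed away from the other prototypes.
    Assumes every label in `labels` has an entry in `prototypes`."""
    total = 0.0
    for feat, lab in zip(features, labels):
        logits = {c: cosine(feat, p) / temperature for c, p in prototypes.items()}
        peak = max(logits.values())  # subtract the max for numerical stability
        log_denominator = peak + math.log(
            sum(math.exp(v - peak) for v in logits.values())
        )
        total += log_denominator - logits[lab]
    return total / len(features)


# Toy data: label 1 = queried object, label 0 = background (assumed convention).
source_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
source_labels = [1, 1, 0, 0]
prototypes = class_prototypes(source_features, source_labels)

# Target-domain pixels with pseudo-labels (assumed to come from the model itself).
target_features = [[0.8, 0.2], [0.2, 0.8]]
target_pseudo_labels = [1, 0]
loss = pixel_contrastive_loss(target_features, target_pseudo_labels, prototypes)
```

In this toy setup the loss is small when target pixels sit near the prototype of their pseudo-label and grows when they sit near a different category's prototype, which is the behavior a category-wise alignment objective is meant to encourage.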
Anthology ID:
2023.findings-acl.621
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9797–9815
URL:
https://aclanthology.org/2023.findings-acl.621
DOI:
10.18653/v1/2023.findings-acl.621
Cite (ACL):
Ye Wang, Tao Jin, Wang Lin, Xize Cheng, Linjun Li, and Zhou Zhao. 2023. Semantic-conditioned Dual Adaptation for Cross-domain Query-based Visual Segmentation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 9797–9815, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Semantic-conditioned Dual Adaptation for Cross-domain Query-based Visual Segmentation (Wang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.621.pdf