Causal Inference for Human-Language Model Collaboration

Bohan Zhang, Yixin Wang, Paramveer Dhillon


Abstract
In this paper, we examine the collaborative dynamics between humansand language models (LMs), where the interactions typically involveLMs proposing text segments and humans editing or responding to theseproposals. Productive engagement with LMs in such scenarios necessitates that humans discern effective text-based interaction strategies, such as editing and response styles, from historical human-LM interactions. This objective is inherently causal, driven by the counterfactual ‘what-if’ question: how would the outcome of collaboration change if humans employed a different text editing/refinement strategy? A key challenge in answering this causal inference question is formulating an appropriate causal estimand: the conventional average treatment effect (ATE) estimand is inapplicable to text-based treatments due to their high dimensionality. To address this concern, we introduce a new causal estimand– *Incremental Stylistic Effect (ISE)*, which characterizes the average impact of infinitesimally shifting a text towards a specific style, such as increasing formality. We establish the conditions for the non-parametric identification of ISE. Building on this, we develop *CausalCollab*, an algorithm designed to estimate the ISE of various interaction strategies in dynamic human-LM collaborations. Our empirical investigations across three distinct human-LM collaboration scenarios reveal that *CausalCollab* effectively reduces confounding and significantly improves counterfactual estimation over a set of competitive baselines.
Anthology ID:
2024.naacl-long.91
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1630–1647
Language:
URL:
https://aclanthology.org/2024.naacl-long.91
DOI:
10.18653/v1/2024.naacl-long.91
Bibkey:
Cite (ACL):
Bohan Zhang, Yixin Wang, and Paramveer Dhillon. 2024. Causal Inference for Human-Language Model Collaboration. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1630–1647, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Causal Inference for Human-Language Model Collaboration (Zhang et al., NAACL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.naacl-long.91.pdf