Ke Chao

2026

At Your Own PACE: A Causal Framework for Evaluating EQ in LLMs
Lei Lyu | Shengling Wang | Ke Chao | Yichao Wei
Findings of the Association for Computational Linguistics: ACL 2026

Emotional Quotient (EQ) has emerged as a competency for seamless human-AI integration. However, since traditional EQ scales focus on self-healing, directly migrating them to Large Language Models (LLMs) often neglects the ability to heal others. While EQ metrics specifically designed for LLMs have been proposed, they remain mired in two dilemmas: dimensional deficiency and fragmented testing. Hence, this paper establishes a Quad-in-One architecture for closed-loop EQ evaluation. First, we propose the PACE Taxonomy to define four dimensions of LLM EQ. Building on this, the Causal-PACE framework is developed to eliminate causal confounding bias triggered by the interactions among EQ dimensions, ensuring a rigorous quantification of composite EQ scores. To operationalize this framework, we implement PACE-AB, a multi-agent EQ evaluation board system. Finally, we curate the PACE-2700 dataset, featuring 2,700 high-quality instructions, to serve as a comprehensive benchmark for large-scale validation. Experimental results demonstrate that the EQ values derived via Causal-PACE achieve a high alignment of 89.31% with human preferences, while the automated PACE-AB system maintains a robust consistency of 83.6%. Our data is publicly available at https://anonymous.4open.science/r/PACE-2700-8E52.


Bridging Internal Consistency and External Alignment: A Causal and Dynamic Interpretability Framework for LLM Generation
Shuyao Xiao | Shengling Wang | Ke Chao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large Language Models (LLMs) are widely used in high-stakes applications, making their interpretability increasingly important. Existing interpretability methods are typically categorized into internal and external perspectives, which are often studied in isolation and tend to overlook two key aspects: causality and temporal dynamics. Explanations are often limited to surface correlations or static dependencies, failing to capture how influences evolve during autoregressive generation. To address these limitations, we propose a causal and dynamic interpretability framework for LLM generation. We first characterize the backdoor-adjusted causal effects of both the generated prefix and the prompt on the current token using a Structural Causal Model. Next, we introduce two metrics to quantify contextual causal influence and question–answer causal influence. Overall, our work provides a unified causal view of internal consistency and external alignment in LLM generation dynamics.

