Jiaao Yu
2026
DcLM: Output Length Control of Large Language Models via Dynamic Length Markers
Zhe Chen | Jiaao Yu | Honglin Li
Findings of the Association for Computational Linguistics: ACL 2026
Length-controllable text generation (LCTG) is essential for tasks such as text summarization and report generation. However, large language models (LLMs) have limited awareness of output length, so precise control over the length of generated text remains a significant challenge. Most existing methods rely on prompt-based frameworks, position encoding, or reinforcement learning for model training. These approaches may degrade semantic quality and struggle to maintain consistent length control across different models and tasks. In this paper, we propose DcLM, a model-agnostic approach that introduces dynamic length markers to guide length-controllable outputs. During training, the model leverages these markers as in-context information, without learning to generate them. At inference time, an external word counter and injected length information guide the model to produce outputs of accurate lengths. We evaluate our method across multiple datasets, and the experimental results demonstrate that DcLM significantly reduces length deviation and generalizes robustly across various length scales and tasks.
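The inference-time idea in the abstract above (an external word counter that injects length information into the context) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the marker format `<rem=N>`, the function names `generate_with_markers` and `toy_step`, the injection interval `every`, and the toy stand-in for a language model.

```python
import re
from typing import Callable, List

# Hypothetical marker format; the paper's actual marker design is not given here.
MARKER = "<rem={}>"


def generate_with_markers(
    step: Callable[[List[str]], str],
    target: int,
    every: int = 1,
    max_steps: int = 1000,
) -> List[str]:
    """Generate words while an external counter injects remaining-length markers.

    `step` stands in for one decoding step of a model: it reads the context
    and returns the next word, or "<eos>" to stop. Markers are written only
    into the context by this loop; the model never produces them, so they
    never appear in the returned output.
    """
    context: List[str] = [MARKER.format(target)]
    output: List[str] = []
    for _ in range(max_steps):
        word = step(context)
        if word == "<eos>":
            break
        context.append(word)
        output.append(word)
        # External word counter: refresh the marker every `every` words.
        if len(output) % every == 0:
            context.append(MARKER.format(target - len(output)))
    return output


def toy_step(context: List[str]) -> str:
    """Toy stand-in for a model: stop once the latest marker reaches zero."""
    for item in reversed(context):
        match = re.fullmatch(r"<rem=(-?\d+)>", item)
        if match:
            return "<eos>" if int(match.group(1)) <= 0 else "word"
    return "word"


if __name__ == "__main__":
    words = generate_with_markers(toy_step, target=5)
    print(len(words), words)
```

In this toy setting, refreshing the marker after every word lets the stand-in model stop exactly at the target; a larger `every` makes the injected information coarser, so the stand-in can overshoot between markers.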
2025
Multimedia Event Extraction with LLM Knowledge Editing
Jiaao Yu | Yijing Lin | Zhipeng Gao | Xuesong Qiu | Lanlan Rui
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The multimodal event extraction task aims to identify event types and arguments from visual and textual representations related to events. Due to the high cost of multimedia training data, previous methods have mainly focused on weak alignment of strong unimodal encoders. However, they ignore the conflict between event understanding and image recognition, so redundant feature perception impairs the understanding of multimodal events. In this paper, we propose a multimodal event extraction strategy with a multi-level redundant feature selection mechanism, which enhances the event understanding ability of multimodal large language models by leveraging knowledge editing techniques and requires no additional parameter optimization. Extensive experiments show that our method outperforms the state-of-the-art (SOTA) baselines on the M2E2 benchmark. Compared with the strongest baseline, we achieve a 34% improvement in precision on event extraction and an 11% improvement in F1 on argument extraction.