SudokuFill: A Multi-Agent Progressive Filling Framework for Document-Level Scientific Information Extraction

Yang Li; Yajiao Wang; Yu Zhang; Yuanzhe Zhang; Maodi Hu; Mengting Zhang; Xi Sun; Hua Yue; Zhixiong Zhang

Abstract

Scientific information extraction (SciIE) is a key bottleneck for turning unstructured papers into computable knowledge bases, yet most existing systems still follow a “local extraction then global assembly” paradigm. This workflow is inherently lossy: by extracting fields in isolation, it breaks global correlations and discards high-confidence signals that could otherwise be reused as internal supervision, forcing systems to repeatedly restart from scratch, especially in long, multimodal scientific documents. In this paper, we propose a different view: SciIE should be solved as a progressive filling problem, much like solving a Sudoku. Once a field is filled with high confidence, it should act as a constraint that guides the remaining uncertain fields. Based on this idea, we introduce SudokuFill, a multi-agent framework that maintains a Global Filling State and performs priority scheduling to establish reliable anchors first, then reuses them as internal supervision for iterative deliberation over harder fields. Evaluated on a specialized document-level adjuvant dataset, our framework achieves a state-of-the-art score of 51.83% on our benchmark. Crucially, SudokuFill enables a 7B model to outperform the vanilla GPT-4o, proving that structured architectural reasoning can effectively compensate for parameter scale.
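The progressive filling idea can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the SudokuFill implementation: the field names (`adjuvant_name`, `dose`), the regex-based extractors, the confidence values, and the function `progressive_fill` are all hypothetical stand-ins for the agents and the Global Filling State described in the abstract. Each round re-scores the unfilled fields against what is already filled and commits the most confident one, so that it can constrain the rest.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an extractor sees the document plus the fields filled
# so far and returns a (value, confidence) guess.
State = Dict[str, str]
Extractor = Callable[[str, State], Tuple[str, float]]


def progressive_fill(doc: str, extractors: Dict[str, Extractor]) -> Tuple[State, List[str]]:
    """Fill fields one at a time, most confident first (toy priority scheduling)."""
    state: State = {}          # stand-in for a global filling state
    order: List[str] = []      # order in which fields were committed
    pending = set(extractors)
    while pending:
        # Re-score every unfilled field against the current state.
        guesses = {f: extractors[f](doc, state) for f in sorted(pending)}
        # Commit the single most confident guess; it becomes an anchor.
        best = max(guesses, key=lambda f: guesses[f][1])
        state[best] = guesses[best][0]
        order.append(best)
        pending.remove(best)
    return state, order


def extract_name(doc: str, state: State) -> Tuple[str, float]:
    # Toy rule: an easy field that can be found without any context.
    return ("alum", 0.95) if "alum" in doc else ("", 0.0)


def extract_dose(doc: str, state: State) -> Tuple[str, float]:
    # With the name anchored, look for the dose right before that name.
    name = state.get("adjuvant_name")
    if name:
        m = re.search(r"([\d.]+ mg) " + re.escape(name), doc)
        if m:
            return m.group(1), 0.9
    # Without an anchor, fall back to the first dose-like string (low confidence).
    m = re.search(r"[\d.]+ mg", doc)
    return (m.group(0), 0.4) if m else ("", 0.0)


DOC = "Controls received 0.1 mg saline; mice received 0.5 mg alum."
state, order = progressive_fill(
    DOC, {"dose": extract_dose, "adjuvant_name": extract_name}
)
print(order)   # ['adjuvant_name', 'dose']
print(state)   # {'adjuvant_name': 'alum', 'dose': '0.5 mg'}
```

In this toy document, extracting `dose` in isolation would pick the control dose (`0.1 mg`), while filling `adjuvant_name` first lets the dose extractor resolve to `0.5 mg`. The real framework presumably uses language-model agents and a richer notion of confidence; the sketch only shows how an anchored field can act as a constraint on a harder one.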

Anthology ID: 2026.findings-acl.1657
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 33112–33138
URL: https://aclanthology.org/2026.findings-acl.1657/
Cite (ACL): Yang Li, Yajiao Wang, Yu Zhang, Yuanzhe Zhang, Maodi Hu, Mengting Zhang, Xi Sun, Hua Yue, and Zhixiong Zhang. 2026. SudokuFill: A Multi-Agent Progressive Filling Framework for Document-Level Scientific Information Extraction. In Findings of the Association for Computational Linguistics: ACL 2026, pages 33112–33138, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): SudokuFill: A Multi-Agent Progressive Filling Framework for Document-Level Scientific Information Extraction (Li et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1657.pdf
Checklist: 2026.findings-acl.1657.checklist.pdf
