OpenPhone: Mobile Agentic Foundation Models

Yangqin Jiang; Chao Huang

Abstract

With the advancement of multimodal large language models (MLLMs), building GUI agent systems has become an increasingly promising direction, especially for mobile platforms, given their rich app ecosystems and intuitive touch interactions. Yet mobile GUI agents face a critical dilemma: truly on-device models (4B parameters or smaller) lack sufficient performance, while capable models (7B parameters and up) are either too large for mobile deployment or prohibitively costly (e.g., cloud-only closed-source MLLMs). To resolve this, we propose OpenPhone, a mobile GUI agent system that uses device-cloud collaboration to combine the cost-efficiency of on-device models with the high capability of cloud models while avoiding the drawbacks of each. Specifically, OpenPhone enhances Qwen2.5-VL-3B via two-stage training (SFT followed by GRPO) on synthetic GUI data for strong decision-making, integrates an efficient long-reasoning mechanism to utilize historical interactions under tight resources, and defaults to on-device execution, escalating only challenging subtasks to the cloud via real-time complexity assessment. Experiments on the online AndroidLab benchmark and diverse apps show that OpenPhone matches or approaches the performance of larger models while significantly reducing cloud costs.
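The device-first routing described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the `Subtask` type, the scalar `complexity` score in [0, 1], the `threshold` value, and the `on_device` and `cloud` callables are all hypothetical stand-ins, and the abstract does not specify how complexity is actually assessed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtask:
    """Hypothetical unit of work; not a type from the paper."""
    description: str
    complexity: float  # assumed score in [0, 1] from some complexity assessor


def route(subtask: Subtask, threshold: float = 0.7) -> str:
    """Default to the device; escalate only when the score exceeds the threshold."""
    return "cloud" if subtask.complexity > threshold else "device"


def run_episode(
    subtasks: List[Subtask],
    on_device: Callable[[str], str],  # stand-in for the on-device model
    cloud: Callable[[str], str],      # stand-in for the cloud model
    threshold: float = 0.7,           # illustrative value, not from the paper
) -> Tuple[List[str], int]:
    """Run each subtask on the chosen backend and count cloud escalations."""
    results: List[str] = []
    cloud_calls = 0
    for subtask in subtasks:
        if route(subtask, threshold) == "cloud":
            cloud_calls += 1
            results.append(cloud(subtask.description))
        else:
            results.append(on_device(subtask.description))
    return results, cloud_calls
```

In this sketch, the number of cloud calls serves as a rough proxy for cloud cost: raising the threshold keeps more subtasks on the device, at the risk of leaving hard subtasks to the smaller model.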

Anthology ID: 2026.findings-acl.1518
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 30362–30380
URL: https://aclanthology.org/2026.findings-acl.1518/
Cite (ACL): Yangqin Jiang and Chao Huang. 2026. OpenPhone: Mobile Agentic Foundation Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 30362–30380, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): OpenPhone: Mobile Agentic Foundation Models (Jiang & Huang, Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1518.pdf
Checklist: 2026.findings-acl.1518.checklist.pdf
