OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents

26 February 2025
Pengzhou Cheng
Zheng Wu
Zongru Wu
Aston Zhang
Zhuosheng Zhang
Gongshen Liu
    LLMAG
Abstract

Autonomous graphical user interface (GUI) agents powered by multimodal large language models have shown great promise. However, a critical yet underexplored issue persists: over-execution, where the agent executes tasks fully autonomously without adequately assessing its confidence in each action, which undermines adaptive human-agent collaboration. This poses substantial risks in complex scenarios, such as those involving ambiguous user instructions, unexpected interruptions, and environmental hijacks. To address the issue, we introduce OS-Kairos, an adaptive GUI agent capable of predicting confidence levels at each interaction step and efficiently deciding whether to act autonomously or seek human intervention. OS-Kairos is developed through two key mechanisms: (i) collaborative probing, which annotates confidence scores at each interaction step; and (ii) confidence-driven interaction, which leverages these confidence scores to elicit the ability of adaptive interaction. Experimental results show that OS-Kairos substantially outperforms existing models on our curated dataset featuring complex scenarios, as well as on established benchmarks such as AITZ and Meta-GUI, with improvements of 24.59% to 87.29% in task success rate. OS-Kairos facilitates adaptive human-agent collaboration, prioritizing effectiveness, generality, scalability, and efficiency for real-world GUI interaction. The dataset and code are available at this https URL.
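
The act-or-ask decision described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or its released code: the `Step` type, the `run_episode` function, the `ask_human` callback, the confidence range of [0, 1], and the fixed `threshold` of 0.8 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One proposed GUI action with a confidence score (hypothetical type)."""
    action: str        # e.g. "CLICK(search)"
    confidence: float  # assumed to lie in [0, 1]; the paper's scale may differ


def run_episode(
    steps: List[Step],
    ask_human: Callable[[Step], str],
    threshold: float = 0.8,  # assumed cut-off, not a value from the paper
) -> List[Tuple[str, str]]:
    """Execute confident steps autonomously and defer the rest to a human.

    Returns a trace of (actor, action) pairs, where actor is "agent" or "human".
    """
    trace: List[Tuple[str, str]] = []
    for step in steps:
        if step.confidence >= threshold:
            # Confident enough: the agent acts on its own proposal.
            trace.append(("agent", step.action))
        else:
            # Low confidence: hand the step over and record the human's action.
            trace.append(("human", ask_human(step)))
    return trace


if __name__ == "__main__":
    demo = [Step("CLICK(search)", 0.95), Step("TYPE(ambiguous query)", 0.40)]
    print(run_episode(demo, ask_human=lambda s: "TYPE(clarified query)"))
```

In this sketch a single fixed threshold separates autonomous steps from deferred ones; how OS-Kairos actually obtains its confidence scores and turns them into interaction decisions is described in the paper itself.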

@article{cheng2025_2503.16465,
  title={OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents},
  author={Pengzhou Cheng and Zheng Wu and Zongru Wu and Aston Zhang and Zhuosheng Zhang and Gongshen Liu},
  journal={arXiv preprint arXiv:2503.16465},
  year={2025}
}