UFO: A UI-Focused Agent for Windows OS Interaction
arXiv:2402.07939

8 February 2024
Chaoyun Zhang
Liqun Li
Shilin He
Xu Zhang
Bo Qiao
Si Qin
Ming-Jie Ma
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
    LLMAG
ArXiv (abs) | PDF | HTML | HuggingFace (17 upvotes) | GitHub (7307★)

Papers citing "UFO: A UI-Focused Agent for Windows OS Interaction"

50 / 83 papers shown
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
Shijie Zhou
Viet Dac Lai
Hao Tan
Jihyung Kil
Wanrong Zhu
Changyou Chen
Ruiyi Zhang
230
2
0
30 Mar 2026
Prune4Web: DOM Tree Pruning Programming for Web Agent
J. Zhang
Kaiquan Chen
Zhihao Lu
Enshen Zhou
Qian Yu
Jing Zhang
452
3
0
26 Nov 2025
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
Genglin Liu
Shijie Geng
Sha Li
Hejie Cui
Sarah Zhang
Xin Liu
Tianyi Liu
CLL
787
4
0
17 Nov 2025
An Efficient Training Pipeline for Reasoning Graphical User Interface Agents
Georgios Pantazopoulos
Eda B. Özyiğit
LRM
441
0
0
11 Nov 2025
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
Yihong Tang
Kehai Chen
Liang Yue
Jinxin Fan
Caishen Zhou
...
Kaiyang Guo
Xingshan Zeng
Wenjing Cun
L. Shang
Min Zhang
LLMAG
214
1
0
20 Oct 2025
SAG-Agent: Enabling Long-Horizon Reasoning in Strategy Games via Dynamic Knowledge Graphs
Chenwei Tang
Jingyu Xing
Xinyu Liu
Zizhou Wang
Jiawei Du
Liangli Zhen
Jiancheng Lv
LRM
236
0
0
17 Oct 2025
CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs
Gucongcong Fan
Chaoyue Niu
Chengfei Lyu
Fan Wu
Guihai Chen
176
5
0
17 Oct 2025
OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies
Peng Di
Faqiang Chen
X. Bai
Hongjun Yang
Qingfeng Li
...
Zhitao Shen
Zheng Li
Wenhui Shi
Junwei Guo
Hang Yu
238
0
0
15 Oct 2025
vAttention: Verified Sparse Attention
Aditya Desai
Kumar Krishna Agrawal
Shuo Yang
Alejandro Cuadron
Luis Gaspar Schroeder
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
VLM
155
0
0
07 Oct 2025
From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents
Yuan Wang
Mingyu Li
Haibo Chen
LMTD, ALM, ELM
250
0
0
06 Oct 2025
LEGOMem: Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation
Dongge Han
Camille Couturier
Daniel Madrigal Diaz
Xuchao Zhang
Victor Rühle
Saravan Rajmohan
155
10
0
06 Oct 2025
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping
Suyuchen Wang
Tianyu Zhang
Ahmed Masry
Christopher Pal
Spandana Gella
Bang Liu
Perouz Taslakian
124
2
0
03 Oct 2025
Agent-ScanKit: Unraveling Memory and Reasoning of Multimodal Agents via Sensitivity Perturbations
Pengzhou Cheng
Lingzhong Dong
Zeng Wu
Zongru Wu
Zhuosheng Zhang
Chengwei Qin
Gongshen Liu
LLMAG
446
2
0
01 Oct 2025
Learning GUI Grounding with Spatial Reasoning from Visual Feedback
Yu Zhao
Wei Chen
Huseyin A. Inan
Samuel Kessler
Lu Wang
...
Fangkai Yang
Chaoyun Zhang
Pasquale Minervini
Saravan Rajmohan
Robert Sim
158
3
0
25 Sep 2025
Towards Understanding Visual Grounding in Visual Language Models
Georgios Pantazopoulos
Eda B. Özyiğit
ObjD
507
4
0
12 Sep 2025
Instruction Agent: Enhancing Agent with Expert Demonstration
Yinheng Li
Hailey Hultquist
Justin Wagle
K. Koishida
LLMAG
143
0
0
08 Sep 2025
Mobile-Agent-v3: Fundamental Agents for GUI Automation
Jiabo Ye
Xi Zhang
Haiyang Xu
Haowei Liu
Junyang Wang
...
Jitong Liao
Qi Zheng
Fei Huang
Jingren Zhou
Ming Yan
LLMAG, LM&Ro
356
85
0
21 Aug 2025
MagicGUI: A Foundational Mobile GUI Agent with Scalable Data Pipeline and Reinforcement Fine-tuning
Liujian Tang
Shaokang Dong
Y. Huang
Minqi Xiang
Hongtao Ruan
...
Qi Zhang
Kang Wang
Y. Zhang
Y. Wang
Yuran Wang
LM&Ro
526
13
0
19 Jul 2025
Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Scheduling System
Yuan Guo
Tingjia Miao
Zheng Wu
Pengzhou Cheng
Ming Zhou
Zhuosheng Zhang
354
7
0
10 Jun 2025
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation
Yuyang Wanyan
Xi Zhang
Haiyang Xu
Haowei Liu
Junyang Wang
...
Ming Yan
Fei Huang
Xiaoshan Yang
Weiming Dong
Changsheng Xu
LLMAG, LRM
450
16
0
05 Jun 2025
macOSWorld: A Multilingual Interactive Benchmark for GUI Agents
Pei Yang
Hai Ci
Mike Zheng Shou
LLMAG
623
7
0
04 Jun 2025
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Qianhui Wu
Kanzhi Cheng
Rui Yang
Chaoyun Zhang
Jianwei Yang
...
Huan Zhang
Tong Zhang
Jianbing Zhang
Dongmei Zhang
J. Gao
LM&Ro
362
49
0
03 Jun 2025
Text2Grad: Reinforcement Learning from Natural Language Feedback
Hanyang Wang
Lu Wang
Chaoyun Zhang
Tianjun Mao
Si Qin
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
333
10
0
28 May 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
Qiushi Sun
Zhoumianze Liu
Chang Ma
Zichen Ding
Fangzhi Xu
...
B. Kao
Wenhai Wang
Biqing Qi
Lingpeng Kong
Zhiyong Wu
LLMAG, LM&Ro
574
18
0
26 May 2025
ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World
Runliang Niu
Jinglong Ji
Yi Chang
Zhiqiang Zhang
243
1
0
25 May 2025
LA-RCS: LLM-Agent-Based Robot Control System
TaekHyun Park
YoungJun Choi
SeungHoon Shin
Kwangil Lee
341
3
0
23 May 2025
ProgRM: Build Better GUI Agents with Progress Rewards
Danyang Zhang
Situo Zhang
Ziyue Yang
Zichen Zhu
Zihan Zhao
Ruisheng Cao
Lu Chen
Kai Yu
294
10
0
23 May 2025
Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents
Pengzhou Cheng
Haowen Hu
Zheng Wu
Zongru Wu
Tianjie Ju
Zhuosheng Zhang
LLMAG, AAML
446
8
0
20 May 2025
Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation
Junyang Wang
Haiyang Xu
Xi Zhang
Ming Yan
Ji Zhang
Fei Huang
Jitao Sang
559
0
0
20 May 2025
From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents
Liangxuan Wu
Chao Wang
Tianming Liu
Yanjie Zhao
Haoyu Wang
AAML
495
16
0
19 May 2025
Can Global XAI Methods Reveal Injected Bias in LLMs? SHAP vs Rule Extraction vs RuleSHAP
Francesco Sovrano
682
167
0
16 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
406
7
0
01 May 2025
Toward a Human-Centered Evaluation Framework for Trustworthy LLM-Powered GUI Agents
Chong Chen
Zhiping Zhang
Ibrahim Khalilov
Bingcan Guo
Simret Araya Gebreegziabher
Yanfang Ye
Ziang Xiao
Yaxing Yao
Tianshi Li
T. Li
LLMAG, ELM
509
12
0
24 Apr 2025
UFO2: The Desktop AgentOS
Chaoyun Zhang
He Huang
Chiming Ni
J. Mu
Si Qin
...
Minghua Ma
Jian-Guang Lou
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
LLMAG
835
26
0
20 Apr 2025
TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents
Bofei Zhang
Zirui Shang
Zhi Gao
Wang Zhang
Rui Xie
Xiaojian Ma
Tao Yuan
Xinxiao Wu
Song-Chun Zhu
Qing Li
LLMAG
580
21
0
17 Apr 2025
The Obvious Invisible Threat: LLM-Powered GUI Agents' Vulnerability to Fine-Print Injections
Chong Chen
Zhiping Zhang
Bingcan Guo
Shang Ma
Ibrahim Khalilov
...
Yanfang Ye
Ziang Xiao
Yaxing Yao
Tianshi Li
Tao Li
AAML, LLMAG, SILM
490
45
0
15 Apr 2025
ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use
Kaixin Li
Ziyang Meng
Hongzhan Lin
Ziyang Luo
Yuchen Tian
Jing Ma
Zhiyong Huang
Tat-Seng Chua
445
157
0
04 Apr 2025
A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
Liangbo Ning
Ziran Liang
Zhuohang Jiang
Haohao Qu
Yujuan Ding
...
Xiao Wei
Shanru Lin
Hui Liu
Philip S. Yu
Qing Li
LLMAG, LM&Ro
837
78
0
30 Mar 2025
ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction
The Web Conference (WWW), 2025
Yiqiao Jin
Stefano Petrangeli
Yu Shen
Gang Wu
LLMAG, LM&Ro
1.0K
3
0
26 Mar 2025
Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG, LRM
489
6
0
24 Mar 2025
API Agents vs. GUI Agents: Divergence and Convergence
Chaoyun Zhang
Shilin He
Liqun Li
Si Qin
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
LLMAG
546
23
0
14 Mar 2025
CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning
Yuqi Zhou
Shuai Wang
Sunhao Dai
Qinglin Jia
Zhaocheng Du
Zhenhua Dong
Jun Xu
LM&Ro
362
5
0
05 Mar 2025
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
Wenjia Jiang
Yangyang Zhuang
Chenxi Song
Xu Yang
Chi Zhang
LLMAG
601
36
0
04 Mar 2025
Smoothing Grounding and Reasoning for MLLM-Powered GUI Agents with Query-Oriented Pivot Tasks
Zongru Wu
Pengzhou Cheng
Zheng Wu
Tianjie Ju
Zhuosheng Zhang
Gongshen Liu
LRM
427
8
0
01 Mar 2025
Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation
Humza Sami
Mubashir ul Islam
Samy Charas
Asav Gandhi
P. Gaillardon
V. Tenace
LLMAG
356
8
0
26 Feb 2025
VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
Jiani Zheng
Lu Wang
Fangkai Yang
Chen Zhang
Shansong Liu
Wenjie Yin
Qingwei Lin
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
OffRL
420
15
0
26 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
International Conference on Learning Representations (ICLR), 2024
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
566
39
0
17 Feb 2025
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Zehan Qi
Xiao-Chang Liu
Iat Long Iong
Hanyu Lai
Xingwu Sun
...
Shuntian Yao
Tianjie Zhang
Wei Xu
J. Tang
Yuxiao Dong
661
149
0
28 Jan 2025
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Zhenhailong Wang
Haiyang Xu
Junyang Wang
Xi Zhang
Ming Yan
Junxuan Zhang
Fei Huang
Heng Ji
616
97
0
20 Jan 2025
Aria-UI: Visual Grounding for GUI Instructions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuhao Yang
Yue Wang
Dongxu Li
Ziyang Luo
Bei Chen
Chenyu Huang
Junnan Li
LM&Ro, LLMAG
647
109
0
20 Dec 2024