2410.23218
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
30 October 2024
Zhiyong Wu
Zhenyu Wu
Fangzhi Xu
Yian Wang
Qiushi Sun
Chengyou Jia
Kanzhi Cheng
Zichen Ding
L. Chen
Paul Pu Liang
Yu Qiao
Papers citing "OS-ATLAS: A Foundation Action Model for Generalist GUI Agents"
10 / 10 papers shown
Title
Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI
Benjamin Raphael Ernhofer
Daniil Prokhorov
Jannica Langner
Dominik Bollmann
19
0
0
09 May 2025
EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation
Biao Yi
Xavier Hu
Y. Chen
Shengyu Zhang
Hongxia Yang
Fan Wu
Fei Wu
LLMAG
41
0
0
08 May 2025
ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang
Zhixiong Zeng
WenKang Han
Yufeng Zhong
Liming Zheng
Shuai Fu
Jingyuan Chen
Lin Ma
36
0
0
01 May 2025
UI-E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis
Xinyi Liu
Xiaoyi Zhang
Ziyun Zhang
Yan Lu
21
0
0
15 Apr 2025
GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents
Run Luo
Lu Wang
Wanwei He
Xiaobo Xia
LLMAG
35
5
0
14 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
D. Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
W. Wang
MLLM
VLM
54
6
1
14 Apr 2025
Kimi-VL Technical Report
Kimi Team
Angang Du
B. Yin
Bowei Xing
Bowen Qu
...
Zhiqi Huang
Zihao Huang
Zijia Zhao
Z. Chen
Zongyu Lin
MLLM
VLM
MoE
88
0
0
10 Apr 2025
A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
Liangbo Ning
Ziran Liang
Zhuohang Jiang
Haohao Qu
Yujuan Ding
...
Xiao Wei
Shanru Lin
Hui Liu
Philip S. Yu
Qing Li
LLMAG
LM&Ro
63
5
0
30 Mar 2025
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning
Zhengxi Lu
Yuxiang Chai
Yaxuan Guo
Xi Yin
Liang Liu
Hao Wang
Han Xiao
Shuai Ren
Guanjing Xiong
H. Li
LLMAG
LRM
67
9
0
27 Mar 2025
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Shravan Nayak
Xiangru Jian
Kevin Qinghong Lin
Juan A. Rodriguez
Montek Kalsi
...
David Vazquez
Christopher Pal
Perouz Taslakian
Spandana Gella
Sai Rajeswar
76
0
0
19 Mar 2025