arXiv: 2505.17629
TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments

23 May 2025
Yuheng Lu
Qian Yu
Hongru Wang
Zeming Liu
Wei Su
Yanping Liu
Yuhang Guo
Maocheng Liang
Yunhong Wang
Haifeng Wang
    LLMAG

Papers citing "TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments"

27 / 27 papers shown
Aria-UI: Visual Grounding for GUI Instructions
Yuhao Yang
Yue Wang
Dongxu Li
Ziyang Luo
Bei Chen
Chenyu Huang
Junnan Li
LM&Ro, LLMAG
166
33
0
20 Dec 2024
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
Wen Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
133
27
0
07 Nov 2024
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Zhiyong Wu
Zhenyu Wu
Fangzhi Xu
Yian Wang
Qiushi Sun
...
Kanzhi Cheng
Zichen Ding
Lixing Chen
Paul Pu Liang
Yu Qiao
96
73
0
30 Oct 2024
Aria: An Open Multimodal Native Mixture-of-Experts Model
Dongxu Li
Yudong Liu
Haoning Wu
Yue Wang
Zhiqi Shen
...
Lihuan Zhang
Hanshu Yan
Guoyin Wang
Bei Chen
Junnan Li
MoE
124
65
0
08 Oct 2024
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro, LLMAG
247
96
0
07 Oct 2024
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding
Qinzhuo Wu
Weikai Xu
Wei Liu
Tao Tan
Jianfeng Liu
Ang Li
Jian Luan
Bin Wang
Shuo Shang
VLM
106
17
0
23 Sep 2024
MobileViews: A Large-Scale Mobile GUI Dataset
Longxi Gao
Li Zhang
Shihe Wang
Shangguang Wang
Yuanchun Li
Mengwei Xu
57
8
0
22 Sep 2024
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
Xiao-Yang Liu
Tianjie Zhang
Yu Gu
Iat Long Iong
Yifan Xu
...
Zhengxiao Du
Chan Hee Song
Yu Su
Yuxiao Dong
Jie Tang
VLM, LLMAG
121
38
0
12 Aug 2024
OmniParser for Pure Vision Based GUI Agent
Yadong Lu
Jianwei Yang
Yelong Shen
Ahmed Hassan Awadallah
MLLM
91
53
0
01 Aug 2024
E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion
Ke Wang
Tianyu Xia
Zhangxuan Gu
Yi Zhao
Shuheng Shen
Changhua Meng
Weiqiang Wang
Ke Xu
67
1
0
20 Jun 2024
GUICourse: From General Vision Language Models to Versatile GUI Agents
Wentong Chen
Junbo Cui
Jinyi Hu
Yujia Qin
Junjie Fang
...
Yupeng Huo
Yuan Yao
Yankai Lin
Zhiyuan Liu
Maosong Sun
LLMAG
153
41
0
17 Jun 2024
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
Quanfeng Lu
Wenqi Shao
Zitao Liu
Fanqing Meng
Boxuan Li
Botong Chen
Siyuan Huang
Kaipeng Zhang
Yu Qiao
Ping Luo
118
43
0
12 Jun 2024
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Junyang Wang
Haiyang Xu
Haitao Jia
Xi Zhang
Ming Yan
Weizhou Shen
Ji Zhang
Fei Huang
Jitao Sang
LM&Ro, LLMAG
120
75
0
03 Jun 2024
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor
Y. Butala
M. Russak
Jing Yu Koh
Kiran Kamble
Waseem Alshikh
Ruslan Salakhutdinov
LLMAG
138
57
0
27 Feb 2024
CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation
Xinbei Ma
Zhuosheng Zhang
Hai Zhao
LLMAG
97
34
0
19 Feb 2024
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Gilles Baechler
Srinivas Sunkara
Maria Wang
Fedir Zubach
Hassan Mansoor
Vincent Etter
Victor Carbune
Jason Lin
Jindong Chen
Abhanshu Sharma
195
59
0
07 Feb 2024
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
Shijue Huang
Wanjun Zhong
Jianqiao Lu
Qi Zhu
Jiahui Gao
...
Yasheng Wang
Lifeng Shang
Xin Jiang
Ruifeng Xu
Qun Liu
LLMAG
76
38
0
30 Jan 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Kanzhi Cheng
Qiushi Sun
Yougang Chu
Fangzhi Xu
Yantao Li
Jianbing Zhang
Zhiyong Wu
LLMAG
277
189
0
17 Jan 2024
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices
Xiangxiang Chu
Limeng Qiao
Xinyang Lin
Shuang Xu
Yang Yang
...
Fei Wei
Xinyu Zhang
Bo Zhang
Xiaolin Wei
Chunhua Shen
MLLM
123
44
0
28 Dec 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
Jinze Bai
Shuai Bai
Shusheng Yang
Shijie Wang
Sinan Tan
Peng Wang
Junyang Lin
Chang Zhou
Jingren Zhou
MLLM, VLM, ObjD
184
945
0
24 Aug 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Yuchen Zhuang
Yue Yu
Kuan-Chieh Wang
Haotian Sun
Chao Zhang
ELM, LLMAG
101
251
0
23 Jun 2023
Mind2Web: Towards a Generalist Agent for the Web
Xiang Deng
Yu Gu
Boyuan Zheng
Shijie Chen
Samuel Stevens
Boshi Wang
Huan Sun
Yu-Chuan Su
LLMAG
123
488
0
09 Jun 2023
MidMed: Towards Mixed-Type Dialogues for Medical Consultation
Xiaoming Shi
Zeming Liu
Chuan Wang
Haitao Leng
Kui Xue
Xiaofan Zhang
Shaoting Zhang
LM&MA, MedIm
72
12
0
05 Jun 2023
Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions
Hui Yang
Sifu Yue
Yunzhong He
RALM
72
172
0
04 Jun 2023
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
120
65
0
04 Feb 2022
UIBert: Learning Generic Multimodal Representations for UI Understanding
Chongyang Bai
Xiaoxue Zang
Ying Xu
Srinivas Sunkara
Abhinav Rastogi
Jindong Chen
Blaise Agüera y Arcas
87
95
0
29 Jul 2021
Towards Conversational Recommendation over Multi-Type Dialogs
Zeming Liu
Haifeng Wang
Zheng-Yu Niu
Hua Wu
Wanxiang Che
Ting Liu
66
204
0
08 May 2020