Android in the Wild: A Large-Scale Dataset for Android Device Control

19 July 2023
Christopher Rawles
Alice Li
Daniel Rodriguez
Oriana Riva
Timothy Lillicrap
    LM&Ro
arXiv: 2307.10088

Papers citing "Android in the Wild: A Large-Scale Dataset for Android Device Control"

50 / 109 papers shown
Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI
Benjamin Raphael Ernhofer
Daniil Prokhorov
Jannica Langner
Dominik Bollmann
25
0
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
LM&Ro
VLM
70
0
0
08 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
M. Yan
Fei Huang
Bo An
20
0
0
01 May 2025
ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang
Zhixiong Zeng
WenKang Han
Yufeng Zhong
Liming Zheng
Shuai Fu
Jingyuan Chen
Lin Ma
51
0
0
01 May 2025
AndroidGen: Building an Android Language Agent under Data Scarcity
Hanyu Lai
Junjie Gao
Xiao-Yang Liu
Y. Xu
S. Zhang
Yuxiao Dong
Jie Tang
LLMAG
72
0
0
27 Apr 2025
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun
Y. Gai
Lijie Chen
Abhilasha Ravichander
Yejin Choi
D. Song
HILM
54
0
0
17 Apr 2025
Imperative MPC: An End-to-End Self-Supervised Learning with Differentiable MPC for UAV Attitude Control
Haonan He
Yuheng Qiu
Junyi Geng
76
0
0
17 Apr 2025
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
Bofei Zhang
Zirui Shang
Zhi Gao
Wang Zhang
Rui Xie
Xiaojian Ma
Tao Yuan
Xinxiao Wu
Song-Chun Zhu
Qing Li
LLMAG
35
1
0
17 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agent
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
VGen
44
0
0
15 Apr 2025
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang
Zichen Ding
Chang Ma
Zijie Chen
Qiushi Sun
Zhenzhong Lan
Junxian He
45
0
0
14 Apr 2025
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents
Saaket Agashe
Kyle Wong
Vincent Tu
Jiachen Yang
Ang Li
Xin Eric Wang
LLMAG
60
1
0
01 Apr 2025
A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
Liangbo Ning
Ziran Liang
Zhuohang Jiang
Haohao Qu
Yujuan Ding
...
Xiao Wei
Shanru Lin
Hui Liu
Philip S. Yu
Qing Li
LLMAG
LM&Ro
88
5
0
30 Mar 2025
UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning
Zhengxi Lu
Yuxiang Chai
Yaxuan Guo
Xi Yin
Liang Liu
Hao Wang
Han Xiao
Shuai Ren
Guanjing Xiong
H. Li
LLMAG
LRM
74
9
0
27 Mar 2025
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG
LRM
56
0
0
24 Mar 2025
GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration
Yuchen Sun
Shanhui Zhao
Tao Yu
Hao Wen
Samith Va
Mengwei Xu
Yuanchun Li
Chongyang Zhang
LLMAG
62
0
0
22 Mar 2025
Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study
Li Lyna Zhang
Longxi Gao
Mengwei Xu
LRM
37
0
0
21 Mar 2025
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment
Gaole Dai
Shiqi Jiang
Ting Cao
Yuanchun Li
Y. Yang
Rui Tan
Mo Li
Lili Qiu
46
0
0
20 Mar 2025
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Shravan Nayak
Xiangru Jian
Kevin Qinghong Lin
Juan A. Rodriguez
Montek Kalsi
...
David Vazquez
Christopher Pal
Perouz Taslakian
Spandana Gella
Sai Rajeswar
91
0
0
19 Mar 2025
MP-GUI: Modality Perception with MLLMs for GUI Understanding
Ziwei Wang
Weizhi Chen
Leyang Yang
Sheng Zhou
Shengchu Zhao
Hanbei Zhan
Jiongchao Jin
Liangcheng Li
Zirui Shao
Jiajun Bu
60
1
0
18 Mar 2025
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
Lutfi Eren Erdogan
Nicholas Lee
Sehoon Kim
Suhong Moon
Hiroki Furuta
Gopala Anumanchipalli
K. K.
Amir Gholami
LLMAG
LM&Ro
AIFin
76
2
0
12 Mar 2025
ReelWave: A Multi-Agent Framework Toward Professional Movie Sound Generation
Zixuan Wang
Chi-Keung Tang
Yu-Wing Tai
DiffM
VGen
58
0
0
10 Mar 2025
FedMABench: Benchmarking Mobile Agents on Decentralized Heterogeneous User Data
Wenhao Wang
Zijie Yu
Rui Ye
J. Zhang
S. Chen
Yanfeng Wang
FedML
45
0
0
07 Mar 2025
SpiritSight Agent: Advanced GUI Agent with One Look
Zhiyuan Huang
Ziming Cheng
Junting Pan
Zhaohui Hou
Mingjie Zhan
LLMAG
96
2
0
05 Mar 2025
AutoEval: A Practical Framework for Autonomous Evaluation of Mobile Agents
Jiahui Sun
Zhichao Hua
Yubin Xia
45
0
0
04 Mar 2025
Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models
Tianjie Ju
Yi Hua
Hao Fei
Zhenyu Shao
Yubin Zheng
Haodong Zhao
M. Lee
W. Hsu
Zhuosheng Zhang
Gongshen Liu
43
0
0
03 Mar 2025
Smoothing Grounding and Reasoning for MLLM-Powered GUI Agents with Query-Oriented Pivot Tasks
Zongru Wu
Pengzhou Cheng
Zheng Wu
Tianjie Ju
Zhuosheng Zhang
Gongshen Liu
LRM
32
1
0
01 Mar 2025
Programming with Pixels: Computer-Use Meets Software Engineering
Pranjal Aggarwal
Sean Welleck
38
0
0
24 Feb 2025
MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions
Yuxuan Liu
Hongda Sun
Wei Liu
Jian Luan
Bo Du
Rui Yan
48
2
0
24 Feb 2025
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
Taiyi Wang
Zhihao Wu
Jianheng Liu
Jianye Hao
J. Wang
Kun Shao
OffRL
34
13
0
24 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
75
14
0
17 Feb 2025
Digi-Q: Learning Q-Value Functions for Training Device-Control Agents
Hao Bai
Yifei Zhou
Li Erran Li
Sergey Levine
Aviral Kumar
OffRL
37
1
0
13 Feb 2025
VSC-RL: Advancing Autonomous Vision-Language Agents with Variational Subgoal-Conditioned Reinforcement Learning
Qingyuan Wu
Jianheng Liu
Jianye Hao
J. Wang
Kun Shao
OffRL
95
0
0
11 Feb 2025
Towards Internet-Scale Training For Agents
Brandon Trabucco
Gunnar A. Sigurdsson
Robinson Piramuthu
Ruslan Salakhutdinov
ALM
98
2
0
10 Feb 2025
AppVLM: A Lightweight Vision Language Model for Online App Control
Georgios Papoudakis
Thomas Coste
Zhihao Wu
Jianye Hao
J. Wang
Kun Shao
49
1
0
10 Feb 2025
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
Hongxin Li
Jingfan Chen
Jingran Su
Yuntao Chen
Qing Li
Zhaoxiang Zhang
74
0
0
04 Feb 2025
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Zehan Qi
Xiao-Chang Liu
Iat Long Iong
Hanyu Lai
X. Sun
...
Shuntian Yao
Tianjie Zhang
Wei Xu
J. Tang
Yuxiao Dong
93
14
0
28 Jan 2025
Falcon-UI: Understanding GUI Before Following User Instructions
Huawen Shen
Chang-Shu Liu
Gengluo Li
Xinlong Wang
Yu Zhou
Can Ma
Xiangyang Ji
LLMAG
77
4
0
12 Dec 2024
The BrowserGym Ecosystem for Web Agent Research
Thibault Le Sellier De Chezelles
Maxime Gasse
Alexandre Lacoste
Alexandre Drouin
Massimo Caccia
...
Siva Reddy
Quentin Cappart
Graham Neubig
Ruslan Salakhutdinov
Nicolas Chapados
LLMAG
96
9
0
06 Dec 2024
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin Qinghong Lin
Linjie Li
Difei Gao
Z. Yang
Shiwei Wu
Zechen Bai
Weixian Lei
Lijuan Wang
Mike Zheng Shou
LLMAG
72
13
0
26 Nov 2024
Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms
Minghe Gao
Wendong Bu
Bingchen Miao
Yang Wu
Yunfei Li
Juncheng Billy Li
Siliang Tang
Qi Wu
Yueting Zhuang
Meng Wang
LM&Ro
33
3
0
17 Nov 2024
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
W. Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
65
12
0
07 Nov 2024
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
Yifan Xu
Xiao Liu
X. Sun
Siyi Cheng
Hao Yu
Hanyu Lai
Shudan Zhang
Dan Zhang
Jie Tang
Yuxiao Dong
LLMAG
44
7
0
31 Oct 2024
Explainable Behavior Cloning: Teaching Large Language Model Agents through Learning by Demonstration
Yanchu Guan
Dong Wang
Y. Wang
Haiqing Wang
Renen Sun
Chenyi Zhuang
Jinjie Gu
Zhixuan Chu
LM&Ro
LLMAG
25
0
0
30 Oct 2024
AutoGLM: Autonomous Foundation Agents for GUIs
Xiao Liu
Bo Qin
Dongzhu Liang
Guang Dong
Hanyu Lai
...
Yujia Wang
Y. Xu
Zehan Qi
Yuxiao Dong
Jie Tang
LLMAG
48
11
0
28 Oct 2024
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning
Xiaoqiang Wang
Bang Liu
LLMAG
LM&Ro
LRM
31
6
0
24 Oct 2024
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
Zhangheng Li
Keen You
H. Zhang
Di Feng
Harsh Agrawal
Xiujun Li
Mohana Prasad Sathya Moorthy
Jeff Nichols
Y. Yang
Zhe Gan
MLLM
48
18
0
24 Oct 2024
Lightweight Neural App Control
Filippos Christianos
Georgios Papoudakis
Thomas Coste
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
47
4
0
23 Oct 2024
Beyond Browsing: API-Based Web Agents
Yueqi Song
Frank F. Xu
Shuyan Zhou
Graham Neubig
43
13
0
21 Oct 2024
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
Jingxuan Chen
Derek Yuen
Bin Xie
Y. Yang
Gongwei Chen
...
Liqiang Nie
Yasheng Wang
Jianye Hao
Jun Wang
Kun Shao
LLMAG
38
5
0
19 Oct 2024
ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents
Jakub Hoscilowicz
Bartosz Maj
Bartosz Kozakiewicz
Oleksii Tymoshchuk
Artur Janicki
LLMAG
47
5
0
09 Oct 2024