arXiv: 1812.09195
Learning to Navigate the Web

21 December 2018
Izzeddin Gur
U. Rückert
Aleksandra Faust
Dilek Z. Hakkani-Tür

Papers citing "Learning to Navigate the Web"

48 / 48 papers shown
ALLOY: Generating Reusable Agent Workflows from User Demonstration
Jiawen Li
Zheng Ning
Yuan Tian
Toby Jia-Jun Li
LLMAG
100
0
0
11 Oct 2025
TextOnly: A Unified Function Portal for Text-Related Functions on Smartphones
Minghao Tu
Chun Yu
Xiyuan Shen
Zhi Zheng
Li Chen
Yuanchun Shi
88
0
0
23 Aug 2025
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use
Xueyu Hu
Tao Xiong
Biao Yi
Zishu Wei
Ruixuan Xiao
...
Zhou Zhao
Hongxia Yang
Fan Wu
Shengyu Zhang
Fei Wu
LLMAG, LM&Ro, AI4TS
230
29
0
06 Aug 2025
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù
Amirhossein Kazemnejad
Nicholas Meade
Arkil Patel
Dongchan Shin
Alejandra Zambrano
Karolina Stañczak
Peter Shaw
Christopher Pal
Siva Reddy
LLMAG
343
17
0
11 Apr 2025
Inducing Programmatic Skills for Agentic Tasks
Zora Z. Wang
Apurva Gandhi
Graham Neubig
Daniel Fried
LLMAG
375
19
0
09 Apr 2025
A2Perf: Real-World Autonomous Agents Benchmark
Ikechukwu Uchendu
Jason J. Jabbour
Korneel Van den Berghe
Joel Runevic
Matthew P. Stewart
...
S. Guadarrama
Jie Tan
Jordan K. Terry
Aleksandra Faust
Vijay Janapa Reddi
249
1
0
04 Mar 2025
AgentStudio: A Toolkit for Building General Virtual Agents
International Conference on Learning Representations (ICLR), 2024
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
436
34
0
17 Feb 2025
RWKV-UI: UI Understanding with Enhanced Perception and Reasoning
Jiaxi Yang
Haowen Hou
ReLM, LRM
123
0
0
06 Feb 2025
Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
Hiroki Furuta
Yutaka Matsuo
Aleksandra Faust
Izzeddin Gur
CLL
591
19
0
03 Jan 2025
The BrowserGym Ecosystem for Web Agent Research
Thibault Le Sellier De Chezelles
Maxime Gasse
Alexandre Lacoste
Alexandre Drouin
Massimo Caccia
...
Siva Reddy
Quentin Cappart
Graham Neubig
Ruslan Salakhutdinov
Nicolas Chapados
LLMAG
1.9K
62
0
06 Dec 2024
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
Wen Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
458
70
0
07 Nov 2024
EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data
Xuetian Chen
Hangcheng Li
Jiaqing Liang
Sihang Jiang
Deqing Yang
LLMAG
444
7
0
25 Oct 2024
From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts
International Conference on Intelligent User Interfaces (IUI), 2024
Zhuohao Jerry Zhang
E. Schoop
Jeffrey Nichols
Anuj Mahajan
Amanda Swearngin
LLMAG
264
1
0
11 Oct 2024
TinyClick: Single-Turn Agent for Empowering GUI Automation
Pawel Pawlowski
Krystian Zawistowski
Wojciech Lapacz
Marcin Skorupa
Adam Wiacek
Sebastien Postansque
Jakub Hoscilowicz
LRM, LLMAG, MLLM
379
9
0
09 Oct 2024
NaviQAte: Functionality-Guided Web Application Navigation
M. Shahbandeh
Parsa Alian
Noor Nashid
Ali Mesbah
225
8
0
16 Sep 2024
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Ori Yoran
S. Amouyal
Chaitanya Malaviya
Ben Bogin
Ofir Press
Jonathan Berant
LLMAG
342
71
0
22 Jul 2024
MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents
Luyuan Wang
Yongyu Deng
Yiwei Zha
Guodong Mao
Qinmin Wang
Tianchen Min
Wei Chen
Shoufa Chen
LLMAG
188
44
0
12 Jun 2024
Benchmarking Mobile Device Control Agents across Diverse Configurations
Juyong Lee
Taywon Min
Minyong An
Dongyoon Hahm
Changyeon Kim
Kimin Lee
328
29
0
25 Apr 2024
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Keen You
Haotian Zhang
E. Schoop
Floris Weers
Amanda Swearngin
Jeffrey Nichols
Yinfei Yang
Zhe Gan
MLLM
341
146
0
08 Apr 2024
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Kevin Xu
Yeganeh Kordi
Kate Sanders
Yizhong Wang
Adam Byerly
Jingyu Zhang
Benjamin Van Durme
Daniel Khashabi
LLMAG
491
16
0
18 Mar 2024
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor
Y. Butala
M. Russak
Jing Yu Koh
Kiran Kamble
Waseem Alshikh
Ruslan Salakhutdinov
LLMAG
471
103
0
27 Feb 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Kanzhi Cheng
Qiushi Sun
Yougang Chu
Fangzhi Xu
Yantao Li
Jianbing Zhang
Zhiyong Wu
LLMAG
663
344
0
17 Jan 2024
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation
Difei Gao
Lei Ji
Zechen Bai
Mingyu Ouyang
Peiran Li
...
Peiyi Wang
Xiangwu Guo
Hengxu Wang
Luowei Zhou
Mike Zheng Shou
LLMAG
297
34
0
20 Dec 2023
UINav: A Practical Approach to Train On-Device Automation Agents
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Wei Li
Fu-Lin Hsu
Will Bishop
Folawiyo Campbell-Ajala
Max Lin
Oriana Riva
511
4
0
15 Dec 2023
Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API
Zhizheng Zhang
Wenxuan Xie
Xiaoyi Zhang
Yan Lu
189
16
0
07 Oct 2023
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
International Conference on Learning Representations (ICLR), 2023
Izzeddin Gur
Hiroki Furuta
Austin Huang
Mustafa Safdari
Yutaka Matsuo
Douglas Eck
Aleksandra Faust
LM&Ro, LLMAG
550
307
0
24 Jul 2023
Android in the Wild: A Large-Scale Dataset for Android Device Control
Neural Information Processing Systems (NeurIPS), 2023
Christopher Rawles
Alice Li
Daniel Rodriguez
Oriana Riva
Timothy Lillicrap
LM&Ro
396
249
0
19 Jul 2023
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
International Conference on Learning Representations (ICLR), 2023
Longtao Zheng
Rongpin Wang
Xinrun Wang
Bo An
LLMAG
356
97
0
13 Jun 2023
From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces
Neural Information Processing Systems (NeurIPS), 2023
Peter Shaw
Mandar Joshi
James Cohan
Jonathan Berant
Panupong Pasupat
Hexiang Hu
Urvashi Khandelwal
Kenton Lee
Kristina Toutanova
LLMAG, LM&Ro
253
74
0
31 May 2023
Towards Cognitive Bots: Architectural Research Challenges
Artificial General Intelligence (AGI), 2023
Habtom Kahsay Gidey
Peter Hillmann
A. Karcher
Alois Knoll
107
7
0
26 May 2023
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
International Conference on Learning Representations (ICLR), 2023
Hiroki Furuta
Kuang-Huei Lee
Ofir Nachum
Yutaka Matsuo
Aleksandra Faust
S. Gu
Izzeddin Gur
LM&Ro
393
140
0
19 May 2023
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Andrea Burns
Krishna Srinivasan
Joshua Ainslie
Geoff Brown
Bryan A. Plummer
Kate Saenko
Jianmo Ni
Mandy Guo
3DV
204
15
0
05 May 2023
Language Models can Solve Computer Tasks
Neural Information Processing Systems (NeurIPS), 2023
Geunwoo Kim
Pierre Baldi
Alexander Shmakov
LLMAG, LM&Ro
522
459
0
30 Mar 2023
Augmented Language Models: a Survey
Grégoire Mialon
Roberto Dessì
Maria Lomeli
Christoforos Nalmpantis
Ramakanth Pasunuru
...
Jane Dwivedi-Yu
Asli Celikyilmaz
Edouard Grave
Yann LeCun
Thomas Scialom
LRM, KELM
254
482
0
15 Feb 2023
Lexi: Self-Supervised Learning of the UI Language
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Pratyay Banerjee
Shweti Mahajan
Kushal Arora
Chitta Baral
Oriana Riva
105
18
0
23 Jan 2023
Understanding HTML with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Izzeddin Gur
Ofir Nachum
Yingjie Miao
Mustafa Safdari
Austin Huang
Aakanksha Chowdhery
Sharan Narang
Noah Fiedel
Aleksandra Faust
AI4CE
460
82
0
08 Oct 2022
MUG: Interactive Multimodal Grounding on User Interfaces
Findings (Findings), 2022
Tao Li
Gang Li
Jingjie Zheng
Purple Wang
Yang Li
LLMAG
174
10
0
29 Sep 2022
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Neural Information Processing Systems (NeurIPS), 2022
Shunyu Yao
Howard Chen
John Yang
Karthik Narasimhan
LLMAG, LM&Ro
763
740
0
04 Jul 2022
Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Sungryull Sohn
Hyunjae Woo
Jongwook Choi
Lyubing Qiang
Izzeddin Gur
Aleksandra Faust
Honglak Lee
BDL, OffRL
222
3
0
25 May 2022
Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki
Akiko Aizawa
LLMAG
175
6
0
15 Mar 2022
A data-driven approach for learning to control computers
International Conference on Machine Learning (ICML), 2022
Peter C. Humphreys
David Raposo
Tobias Pohlen
Gregory Thornton
Rachita Chhaparia
...
Josh Abramson
Petko Georgiev
Alex Goldin
Adam Santoro
Timothy Lillicrap
311
115
0
16 Feb 2022
Environment Generation for Zero-Shot Compositional Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Izzeddin Gur
Natasha Jaques
Yingjie Miao
Jongwook Choi
Manoj Kumar Tiwari
Honglak Lee
Aleksandra Faust
242
45
0
21 Jan 2022
WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Ouyang Long
...
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
ALM, RALM
458
1,601
0
17 Dec 2021
Learning UI Navigation through Demonstrations composed of Macro Actions
Wei Li
LLMAG
133
9
0
16 Oct 2021
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning
Maayan Shvo
Zhiming Hu
Rodrigo Toro Icarte
Iqbal Mohomed
A. Jepson
Sheila A. McIlraith
176
16
0
31 May 2021
Adversarial Environment Generation for Learning to Navigate the Web
Izzeddin Gur
Natasha Jaques
Kevin Malta
Manoj Kumar Tiwari
Honglak Lee
Aleksandra Faust
215
18
0
02 Mar 2021
Rapid Task-Solving in Novel Environments
Samuel Ritter
Ryan Faulkner
Laurent Sartran
Adam Santoro
M. Botvinick
David Raposo
163
30
0
05 Jun 2020
Evolving Rewards to Automate Reinforcement Learning
Aleksandra Faust
Anthony G. Francis
Dar Mehta
200
52
0
18 May 2019