Grounding Open-Domain Instructions to Automate Web Support Tasks

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

30 March 2021

Papers citing "Grounding Open-Domain Instructions to Automate Web Support Tasks"

LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents

Jinzhe Tan

Karim Benyekhlef

AILaw 3DV

279

28 Nov 2025

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress

...

132

11 Nov 2025

V2P: Visual Attention Calibration for GUI Grounding via Background Suppression and Center Peaking

Leilei Gan

Chenyi Zhuang

Jinjie Gu

109

19 Aug 2025

Mind the Web: The Security of Web Use Agents

262

08 Jun 2025

X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

278

21 May 2025

Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark

420

24 Mar 2025

VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms

471

18 Mar 2025

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

866

17 Feb 2025

WebWalker: Benchmarking LLMs in Web Traversal

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

619

13 Jan 2025

Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms

313

17 Nov 2024

Foundations and Recent Trends in Multimodal Mobile Agents: A Survey

LM&Ro LLMAG OffRL AI4TS

426

04 Nov 2024

Infogent: An Agent-Based Framework for Web Information Aggregation

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Zhenhailong Wang

244

24 Oct 2024

Large Language Models Empowered Personalized Web Agents

The Web Conference (WWW), 2024

Yongqi Li

487

22 Oct 2024

Beyond Browsing: API-Based Web Agents

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

588

21 Oct 2024

ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents

607

09 Oct 2024

A Survey on Complex Tasks for Goal-Directed Interactive Agents

Mareike Hartmann

Alexander Koller

LM&Ro LLMAG

293

27 Sep 2024

NaviQAte: Functionality-Guided Web Application Navigation

261

16 Sep 2024

Identifying User Goals from UI Trajectories

349

20 Jun 2024

GUICourse: From General Vision Language Models to Versatile GUI Agents

...

411

17 Jun 2024

MMInA: Benchmarking Multihop Multimodal Internet Agents

312

15 Apr 2024

Autonomous Evaluation and Refinement of Digital Agents

507

09 Apr 2024

WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents

199

08 Apr 2024

Tur[k]ingBench: A Challenge Benchmark for Web Agents

Kate Sanders

Adam Byerly

Jingyu Zhang

Benjamin Van Durme

Daniel Khashabi

LLMAG

537

18 Mar 2024

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

493

105

27 Feb 2024

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

Xing Han Lù

Zdeněk Kasner

Siva Reddy

309

118

08 Feb 2024

WebVLN: Vision-and-Language Navigation on Websites

Qi Wu

241

25 Dec 2023

RoboGPT: an intelligent agent of making embodied long-term decisions for daily instruction tasks

226

27 Nov 2023

Multi-Level Compositional Reasoning for Interactive Instruction Following

AAAI Conference on Artificial Intelligence (AAAI), 2023

269

18 Aug 2023

WebArena: A Realistic Web Environment for Building Autonomous Agents

International Conference on Learning Representations (ICLR), 2023

Xuhui Zhou

...

Daniel Fried

Graham Neubig

681

840

25 Jul 2023

Referring to Screen Texts with Voice Assistants

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

206

10 Jun 2023

Mind2Web: Towards a Generalist Agent for the Web

Neural Information Processing Systems (NeurIPS), 2023

Huan Sun

601

773

09 Jun 2023

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

...

291

20 Dec 2022

UGIF: UI Grounded Instruction Following

S. Venkatesh

Partha P. Talukdar

S. Narayanan

286

14 Nov 2022

META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

290

23 May 2022

Multimodal Conversational AI: A Survey of Datasets and Approaches

Anirudh S. Sundar

Larry Heck

166

13 May 2022

Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language

Shuyan Zhou

Pengcheng Yin

Graham Neubig

LM&Ro

231

16 Sep 2021

mForms : Multimodal Form-Filling with Question Answering

International Conference on Language Resources and Evaluation (LREC), 2020

Larry Heck

S. Heck

Anirudh S. Sundar

357

24 Nov 2020