Mapping Natural Language Commands to Web Elements

28 August 2018
Panupong Pasupat
Tianrui Jiang
Emmy Liu
Kelvin Guu
Abigail Z. Jacobs

Papers citing "Mapping Natural Language Commands to Web Elements"

24 / 24 papers shown
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use
Xueyu Hu
Tao Xiong
Biao Yi
Zishu Wei
Ruixuan Xiao
...
Zhou Zhao
Hongxia Yang
Fan Wu
Shengyu Zhang
Fei Wu
LLMAG, LM&Ro, AI4TS
234
29
0
06 Aug 2025
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
Lucas-Andrei Thil
Mirela Popa
Gerasimos Spanakis
LLMAG
137
5
0
01 May 2024
Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces
Yue Jiang
Changkong Zhou
Vikas Garg
Antti Oulasvirta
209
13
0
21 Apr 2024
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Kevin Xu
Yeganeh Kordi
Kate Sanders
Yizhong Wang
Adam Byerly
Jingyu Zhang
Benjamin Van Durme
Daniel Khashabi
LLMAG
499
16
0
18 Mar 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
296
117
0
08 Feb 2024
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation
Difei Gao
Lei Ji
Zechen Bai
Mingyu Ouyang
Peiran Li
...
Peiyi Wang
Xiangwu Guo
Hengxu Wang
Luowei Zhou
Mike Zheng Shou
LLMAG
305
35
0
20 Dec 2023
DiLogics: Creating Web Automation Programs With Diverse Logics
ACM Symposium on User Interface Software and Technology (UIST), 2023
Kevin Pu
Jim Yang
Angel Yuan
Minyi Ma
Rui Dong
Boyu Han
Yuanchun Chen
Tovi Grossman
99
12
0
10 Aug 2023
Referring to Screen Texts with Voice Assistants
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shruti Bhargava
Anand Dhoot
I. Jonsson
Hoang Long Nguyen
Alkesh Patel
Hong-ye Yu
Vincent Renkens
205
2
0
10 Jun 2023
DroidBot-GPT: GPT-powered UI Automation for Android
Hao Wen
Hongmin Wang
Jiaxuan Liu
Yan Liang
LM&Ro, LM&MA
416
60
0
14 Apr 2023
Language Models can Solve Computer Tasks
Neural Information Processing Systems (NeurIPS), 2023
Geunwoo Kim
Pierre Baldi
Alexander Shmakov
LLMAG, LM&Ro
530
460
0
30 Mar 2023
Lexi: Self-Supervised Learning of the UI Language
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Pratyay Banerjee
Shweti Mahajan
Kushal Arora
Chitta Baral
Oriana Riva
117
18
0
23 Jan 2023
Understanding HTML with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Izzeddin Gur
Ofir Nachum
Yingjie Miao
Mustafa Safdari
Austin Huang
Aakanksha Chowdhery
Sharan Narang
Noah Fiedel
Aleksandra Faust
AI4CE
476
82
0
08 Oct 2022
MUG: Interactive Multimodal Grounding on User Interfaces
Findings (Findings), 2022
Tao Li
Gang Li
Jingjie Zheng
Purple Wang
Yang Li
LLMAG
174
10
0
29 Sep 2022
Enabling Conversational Interaction with Mobile UI using Large Language Models
International Conference on Human Factors in Computing Systems (CHI), 2022
Bryan Wang
Gang Li
Yang Li
400
172
0
18 Sep 2022
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Neural Information Processing Systems (NeurIPS), 2022
Shunyu Yao
Howard Chen
John Yang
Karthik Narasimhan
LLMAG, LM&Ro
767
741
0
04 Jul 2022
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Liangtai Sun
Xingyu Chen
Lu Chen
Tianle Dai
Zichen Zhu
Kai Yu
LLMAG
270
84
0
23 May 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
European Conference on Computer Vision (ECCV), 2022
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
409
80
0
04 Feb 2022
VUT: Versatile UI Transformer for Multi-Modal Multi-Task User Interface Modeling
Yang Li
Gang Li
Xin Zhou
Mostafa Dehghani
A. Gritsenko
MLLM
159
38
0
10 Dec 2021
Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information?
Julia Rozanova
Deborah Ferreira
K. Dubba
Weiwei Cheng
Dell Zhang
André Freitas
LM&Ro
154
12
0
17 Sep 2021
Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
LRM
176
27
0
17 Apr 2021
Grounding Open-Domain Instructions to Automate Web Support Tasks
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
N. Xu
Sam Masling
Michael Du
Giovanni Campagna
Larry Heck
James A. Landay
M. Lam
LLMAG, AI4TS
295
51
0
30 Mar 2021
Screen2Vec: Semantic Embedding of GUI Screens and GUI Components
International Conference on Human Factors in Computing Systems (CHI), 2021
Toby Jia-Jun Li
Lindsay Popowski
Tom Michael Mitchell
Brad A. Myers
237
115
0
11 Jan 2021
FLIN: A Flexible Natural Language Interface for Web Navigation
North American Chapter of the Association for Computational Linguistics (NAACL), 2020
Sahisnu Mazumder
Oriana Riva
LRM
324
27
0
24 Oct 2020
Building an Application Independent Natural Language Interface
Sahisnu Mazumder
Bing-Quan Liu
Shuai Wang
Sepideh Esmaeilpour
112
3
0
30 Oct 2019