RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users

14 April 2025
Suyu Ye
Haojun Shi
Darren Shih
Hyokun Yun
Tanya Roosta
Tianmin Shu
arXiv: 2504.10445

Papers citing "RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users"

10 / 10 papers shown
Evaluating Long-Context Reasoning in LLM-Based WebAgents
Andy Chung
Yichi Zhang
Kaixiang Lin
Aditya Rawal
Qiaozi Gao
Joyce Chai
LLMAG, LRM
120
1
0
03 Dec 2025
OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability
Karen Ullrich
Jingtong Su
Claudia Shi
Arjun Subramonian
Amir Bar
Ivan Evtimov
Nikolaos Tsilivis
Randall Balestriero
Julia Kempe
Mark Ibrahim
117
0
0
25 Nov 2025
Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games
Jingran Zhang
Ning Li
Justin Cui
LLMAG, LM&Ro, LRM
170
1
0
30 Oct 2025
ColorBench: Benchmarking Mobile Agents with Graph-Structured Framework for Complex Long-Horizon Tasks
Yuanyi Song
Heyuan Huang
Qiqiang Lin
Yin Zhao
Xiangmou Qu
...
Zhuosheng Zhang
Jun Wang
Yong Yu
Weinan Zhang
Zhaoxiang Wang
LLMAG, OffRL
125
1
0
16 Oct 2025
Interaction-Driven Browsing: A Human-in-the-Loop Conceptual Framework Informed by Human Web Browsing for Browser-Using Agents
Hyeonggeun Yun
Jinkyu Jang
152
1
0
15 Sep 2025
NatureGAIA: Pushing the Frontiers of GUI Agents with a Challenging Benchmark and High-Quality Trajectory Dataset
Zihan Zheng
Tianle Cui
Chuwen Xie
Jiahui Zhang
Jiahui Pan
Lewei He
Qianglong Chen
LLMAG
187
0
0
02 Aug 2025
Phi-Ground Tech Report: Advancing Perception in GUI Grounding
Miaosen Zhang
Ziqiang Xu
Jialiang Zhu
Qi Dai
Kai Qiu
...
Chong Luo
Tianyi Chen
Justin Wagle
Tim Franklin
Baining Guo
LRM
234
10
0
31 Jul 2025
EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments
Zefang Liu
Yinzhu Quan
207
5
0
09 Jun 2025
MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation
Chenghao Yang
Yinbo Luo
Zhoufutu Wen
Qi Chu
Tao Gong
...
Kaiyuan Zhang
Jianpeng Jiao
Ge Zhang
Wenhao Huang
Nenghai Yu
LLMAG, LRM
186
1
0
27 May 2025
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
International Conference on Learning Representations (ICLR), 2024
Lawrence Jang
Yinheng Li
Charles Ding
Justin Lin
Paul Pu Liang
Dan Zhao
Rogerio Bonatti
K. Koishida
413
25
0
24 Oct 2024