Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.17491
Cited By
Language Models can Solve Computer Tasks
30 March 2023
Geunwoo Kim
Pierre Baldi
Stephen Marcus McAleer
LLMAG
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Language Models can Solve Computer Tasks"
50 / 256 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang
Zhixiong Zeng
WenKang Han
Yufeng Zhong
Liming Zheng
Shuai Fu
Jingyuan Chen
Lin Ma
48
0
0
01 May 2025
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo
Lajanugen Logeswaran
Justin Johnson
Honglak Lee
48
0
0
01 May 2025
Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs
Paiheng Xu
Gang Wu
Xiang Chen
Tong Yu
Chang Xiao
Franck Dernoncourt
Tianyi Zhou
Wei Ai
Viswanathan Swaminathan
OffRL
50
0
0
29 Apr 2025
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG
OffRL
KELM
LRM
86
0
0
25 Apr 2025
Cracking the Code of Action: a Generative Approach to Affordances for Reinforcement Learning
Lynn Cherif
Flemming Kondrup
David Venuto
Ankit Anand
Doina Precup
Khimya Khetarpal
LM&Ro
37
0
0
24 Apr 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
H. Li
VLM
63
0
0
22 Apr 2025
Are AI agents the new machine translation frontier? Challenges and opportunities of single- and multi-agent systems for multilingual digital communication
Vicent Briva-Iglesias
LLMAG
29
0
0
17 Apr 2025
WebLists: Extracting Structured Information From Complex Interactive Websites Using Executable LLM Agents
Arth Bohra
Manvel Saroyan
Danil Melkozerov
Vahe Karufanyan
Gabriel Maher
Pascal Weinberger
Artem Harutyunyan
Giovanni Campagna
LLMAG
35
0
0
17 Apr 2025
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following
Sai Adith Senthil Kumar
Hao Yan
Saipavan Perepa
Murong Yue
Ziyu Yao
54
0
0
08 Apr 2025
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups
Rijul Magu
Arka Dutta
Sean Kim
Ashiqur R. KhudaBukhsh
Munmun De Choudhury
19
0
0
08 Apr 2025
On the Robustness of GUI Grounding Models Against Image Attacks
Haoren Zhao
Tianyi Chen
Zhen Wang
AAML
31
0
0
07 Apr 2025
Beyond Accuracy: The Role of Calibration in Self-Improving Large Language Models
Liangjie Huang
Dawei Li
Huan Liu
Lu Cheng
LRM
34
0
0
03 Apr 2025
A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
Liangbo Ning
Ziran Liang
Zhuohang Jiang
Haohao Qu
Yujuan Ding
...
Xiao Wei
Shanru Lin
Hui Liu
Philip S. Yu
Qing Li
LLMAG
LM&Ro
86
5
0
30 Mar 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan
Yu-Hu Li
Honglin Lin
Qizhi Pei
Zinan Tang
Wei Yu Wu
Chenlin Ming
H. V. Zhao
Conghui He
Lijun Wu
LRM
59
0
0
21 Mar 2025
Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique
Y. Li
Jiahao Xu
Tian Liang
Xingyu Chen
Zhiwei He
...
Rui Wang
Z. Zhang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
43
1
0
21 Mar 2025
DeskVision: Large Scale Desktop Region Captioning for Advanced GUI Agents
Yibin Xu
Liang Yang
Hao Chen
Hua Wang
Zhi Chen
Yaohua Tang
3DV
56
0
0
14 Mar 2025
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
Lutfi Eren Erdogan
Nicholas Lee
Sehoon Kim
Suhong Moon
Hiroki Furuta
Gopala Anumanchipalli
K. K.
Amir Gholami
LLMAG
LM&Ro
AIFin
76
2
0
12 Mar 2025
Exploiting Edited Large Language Models as General Scientific Optimizers
Qitan Lv
T. Liu
H. Wang
36
0
0
08 Mar 2025
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
Yu Zhang
Shutong Qiao
Jiaqi Zhang
Tzu-Heng Lin
Chen Gao
Y. Li
LM&Ro
LM&MA
84
0
0
07 Mar 2025
LLMs Can Generate a Better Answer by Aggregating Their Own Responses
Zichong Li
Xinyu Feng
Yuheng Cai
Zixuan Zhang
Tianyi Liu
Chen Liang
Weizhu Chen
Haoyu Wang
T. Zhao
LRM
50
1
0
06 Mar 2025
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Defne Tur
Nicholas Meade
Xing Han Lù
Alejandra Zambrano
Arkil Patel
Esin Durmus
Spandana Gella
Karolina Stañczak
Siva Reddy
LLMAG
ELM
82
2
0
06 Mar 2025
Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability
Chenhui Xu
Dancheng Liu
Jiajie Li
Amir Nassereldine
Zhaohui Li
Jinjun Xiong
LRM
54
0
0
05 Mar 2025
Language Models can Self-Improve at State-Value Estimation for Better Search
Ethan Mendes
Alan Ritter
LRM
52
3
0
04 Mar 2025
Instruct-of-Reflection: Enhancing Large Language Models Iterative Reflection Capabilities via Dynamic-Meta Instruction
Liping Liu
Chunhong Zhang
Likang Wu
Chuang Zhao
Zheng Hu
Ming He
Jianping Fan
LLMAG
LRM
36
0
0
02 Mar 2025
Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time
Jiazheng Li
Yuxiang Zhou
Junru Lu
Gladys Tyen
Lin Gui
Cesare Aloisi
Yulan He
LRM
33
2
0
26 Feb 2025
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
Jie He
Jennifer Neville
Mengting Wan
Longqi Yang
Hui Liu
Xiaofeng Xu
Xia Song
Jeff Z. Pan
Pei Zhou
LLMAG
SyDa
58
0
0
26 Feb 2025
Self-rewarding correction for mathematical reasoning
Wei Xiong
Hanning Zhang
Chenlu Ye
Lichang Chen
Nan Jiang
Tong Zhang
ReLM
KELM
LRM
61
9
0
26 Feb 2025
Independent Mobility GPT (IDM-GPT): A Self-Supervised Multi-Agent Large Language Model Framework for Customized Traffic Mobility Analysis Using Machine Learning Models
Fengze Yang
Xiaoyue Cathy Liu
Lingjiu Lu
Bingzhang Wang
Chenxi
35
0
0
25 Feb 2025
RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents
Weizhe Chen
Sven Koenig
B. Dilkina
LLMAG
100
8
0
17 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
75
14
0
17 Feb 2025
LLMs can implicitly learn from mistakes in-context
Lisa Alazraki
Maximilian Mozes
Jon Ander Campos
Yi Chern Tan
Marek Rei
Max Bartolo
ReLM
LRM
90
0
0
12 Feb 2025
WebWalker: Benchmarking LLMs in Web Traversal
Jialong Wu
Wenbiao Yin
Yong-feng Jiang
Zhenglin Wang
Zekun Xi
...
Linhai Zhang
Yulan He
Deyu Zhou
Pengjun Xie
Fei Huang
41
5
0
13 Jan 2025
Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting
Dong-Hai Zhu
Yu-Jie Xiong
Jia-Chen Zhang
Xi-Jiong Xie
Chun-Ming Xia
ReLM
LRM
37
0
0
08 Jan 2025
Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
Hiroki Furuta
Yutaka Matsuo
Aleksandra Faust
Izzeddin Gur
CLL
80
13
0
03 Jan 2025
LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions
Adam Ishay
Joohyung Lee
LRM
37
1
0
01 Jan 2025
Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning
Huchen Jiang
Yangyang Ma
Chaofan Ding
Kexin Luan
Xinhan Di
ReLM
LRM
31
2
0
23 Dec 2024
Falcon-UI: Understanding GUI Before Following User Instructions
Huawen Shen
Chang-Shu Liu
Gengluo Li
Xinlong Wang
Yu Zhou
Can Ma
Xiangyang Ji
LLMAG
77
4
0
12 Dec 2024
The BrowserGym Ecosystem for Web Agent Research
Thibault Le Sellier De Chezelles
Maxime Gasse
Alexandre Lacoste
Alexandre Drouin
Massimo Caccia
...
Siva Reddy
Quentin Cappart
Graham Neubig
Ruslan Salakhutdinov
Nicolas Chapados
LLMAG
96
9
0
06 Dec 2024
Autonomous Industrial Control using an Agentic Framework with Large Language Models
Javal Vyas
Mehmet Mercangöz
AI4CE
LLMAG
36
2
0
08 Nov 2024
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
W. Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
65
12
0
07 Nov 2024
DynaSaur: Large Language Agents Beyond Predefined Actions
Dang Nguyen
Viet Dac Lai
Seunghyun Yoon
Ryan Rossi
Handong Zhao
...
Nedim Lipka
Yu-Chiang Frank Wang
Trung H. Bui
Franck Dernoncourt
Tianyi Zhou
LM&Ro
ELM
LLMAG
39
6
0
04 Nov 2024
EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data
Xuetian Chen
Hangcheng Li
Jiaqing Liang
Sihang Jiang
Deqing Yang
LLMAG
46
2
0
25 Oct 2024
CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
LRM
23
0
0
23 Oct 2024
SMART: Self-learning Meta-strategy Agent for Reasoning Tasks
Rongxing Liu
Kumar Shridhar
Manish Prajapat
Patrick Xia
Mrinmaya Sachan
LLMAG
LRM
23
3
0
21 Oct 2024
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Hyungjoo Chae
Namyoung Kim
Kai Tzu-iunn Ong
Minju Gwak
Gwanwoo Song
Jihoon Kim
S. Kim
Dongha Lee
Jinyoung Yeo
LLMAG
30
13
0
17 Oct 2024
Enhancing Mathematical Reasoning in LLMs by Stepwise Correction
Zhenyu Wu
Qingkai Zeng
Z. Zhang
Zhaoxuan Tan
Chao Shen
Meng-Long Jiang
KELM
LRM
31
0
0
16 Oct 2024
Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2
Mohamad Abdi
Gerardo Hermosillo Valadez
H. Yerebakan
MedIm
16
0
0
16 Oct 2024
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
L. Yang
Zhaochen Yu
T. Zhang
Minkai Xu
Joseph E. Gonzalez
Bin Cui
Shuicheng Yan
ELM
ReLM
LRM
44
0
0
11 Oct 2024
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe
Jiuzhou Han
Shuyu Gan
Jiachen Yang
Ang Li
Xin Eric Wang
LLMAG
LM&Ro
AIFin
36
19
0
10 Oct 2024
1
2
3
4
5
6
Next