WebGPT: Browser-assisted question-answering with human feedback

17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Long Ouyang
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
ALM, RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,123 papers shown
Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Hongru Wang
Wenyu Huang
Yufei Wang
Yuanhao Xi
Jianqiao Lu
Huan Zhang
Nan Hu
Zeming Liu
Jeff Z. Pan
Kam-Fai Wong
LLMAG
315
7
0
19 May 2025
Pairwise Calibrated Rewards for Pluralistic Alignment
Daniel Halpern
Evi Micha
Ariel D. Procaccia
Itai Shapira
220
0
0
17 May 2025
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Reward Design
Siliang Zeng
Quan Wei
William Brown
Oana Frunza
...
Anderson Schneider
Yuriy Nevmyvaka
Yang Katie Zhao
Alfredo García
Mingyi Hong
LRM
349
22
0
17 May 2025
PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning
IEEE International Conference on Information Reuse and Integration (IRI), 2025
Falong Fan
Xi Li
LLMAG, AAML
336
6
0
16 May 2025
LLM Agents Are Hypersensitive to Nudges
Manuel Cherep
Pattie Maes
Nikhil Singh
292
2
0
16 May 2025
Demystifying AI Agents: The Final Generation of Intelligence
Kevin J McNamara
Rhea Pritham Marpu
188
2
0
15 May 2025
Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
Hongjin Qian
Zhengyang Liang
RALM, LRM
514
6
0
14 May 2025
Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach
Ethics: An International Journal of Social, Political, and Legal Philosophy (Ethics), 2025
Shannon Lodoen
Alexi Orchard
243
0
0
14 May 2025
HealthBench: Evaluating Large Language Models Towards Improved Human Health
Rahul Arora
Jason W. Wei
Rebecca Soskin Hicks
Preston Bowman
Joaquin Quiñonero Candela
...
Meghan Shah
Andrea Vallone
Alex Beutel
Johannes Heidecke
K. Singhal
LM&MA, AI4MH, ELM
296
125
0
13 May 2025
Large Language Models for Computer-Aided Design: A Survey
Licheng Zhang
Bach Le
Naveed Akhtar
Siew-Kei Lam
Tuan Ngo
3DV, AI4CE
391
9
0
13 May 2025
ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution
Xiaolin Huang
Weiwen Liu
Xingshan Zeng
Yanhua Huang
Xinlong Hao
...
Yirong Zeng
Chuhan Wu
Yun Wang
Ruiming Tang
Defu Lian
KELM
377
4
0
12 May 2025
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving
Xinji Mai
Haotian Xu
Zhong-Zhi Li
X. Wu
Weinong Wang
J. Hu
Yingying Zhang
Wenqiang Zhang
ReLM, LRM
528
37
0
12 May 2025
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence
Yu Qiao
Huy Q. Le
Avi Deb Raha
Phuong-Nam Tran
Apurba Adhikary
Mengchun Zhang
Loc X. Nguyen
Eui-nam Huh
Zhu Han
Choong Seon Hong
AI4CE
401
5
0
11 May 2025
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
Zhuocheng Gong
Jian Guan
Wei Wu
Huishuai Zhang
Dongyan Zhao
342
4
0
08 May 2025
Advancing and Benchmarking Personalized Tool Invocation for LLMs
Xiaolin Huang
Yuefeng Huang
Wen Liu
Xingshan Zeng
Yijiao Wang
Ruiming Tang
Hong Xie
Defu Lian
248
2
0
07 May 2025
Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering
Joshua Owotogbe
LLMAG
278
6
0
06 May 2025
Soft Best-of-n Sampling for Model Alignment
C. M. Verdun
Alex Oesterling
Himabindu Lakkaraju
Flavio du Pin Calmon
BDL
861
7
0
06 May 2025
RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation
Tiantian Gan
Qiyao Sun
78
15
0
06 May 2025
A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Miaomiao Ji
Yanqiu Wu
Zhibin Wu
Shoujin Wang
Jian Yang
Mark Dras
Usman Naseem
373
9
0
05 May 2025
Sailing by the Stars: A Survey on Reward Models and Learning Strategies for Learning from Rewards
Xiaobao Wu
LRM
581
8
0
05 May 2025
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Christian Schroeder de Witt
AAML, AI4CE
1.1K
35
0
04 May 2025
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo
Lajanugen Logeswaran
Justin Johnson
Honglak Lee
375
10
0
01 May 2025
Coral Protocol: Open Infrastructure Connecting The Internet of Agents
Roman J. Georgio
Caelum Forder
Suman Deb
Andri Rahimov
Peter Carroll
Önder Gürcan
402
0
0
30 Apr 2025
A Domain-Agnostic Scalable AI Safety Ensuring Framework
Beomjun Kim
Kangyeon Kim
Sunwoo Kim
Yeonsang Shin
Heejin Ahn
630
0
0
29 Apr 2025
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese
Peilin Zhou
Bruce Leon
Xiang Ying
Chen Zhang
Yifan Shao
...
Sixin Hong
J. Ren
Jian Chen
Chao-Hong Liu
Yining Hua
RALM, ELM, LRM
356
54
0
27 Apr 2025
AndroidGen: Building an Android Language Agent under Data Scarcity
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Hanyu Lai
Junjie Gao
Xiao-Yang Liu
Zifei Shan
Shanghang Zhang
Yuxiao Dong
Jie Tang
LLMAG
320
5
0
27 Apr 2025
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG, OffRL, KELM, LRM
511
0
0
25 Apr 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Li Sheng
Xuekai Zhu
...
Youbang Sun
Zhiyuan Ma
Lifan Yuan
Ning Ding
Bowen Zhou
OffRL
1.3K
117
0
22 Apr 2025
Establishing Reliability Metrics for Reward Models in Large Language Models
Yizhou Chen
Yawen Liu
Xuesi Wang
Qingtao Yu
Guangda Huzhang
Anxiang Zeng
Han Yu
Zhiming Zhou
287
1
0
21 Apr 2025
a1: Steep Test-time Scaling Law via Environment Augmented Generation
Shansong Liu
Shenghua Liu
Yiwei Wang
Baolong Bi
Yuyao Ge
Jun Wan
Yurong Wu
Xueqi Cheng
LRM
295
11
0
20 Apr 2025
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
International Conference on Learning Representations (ICLR), 2025
João Loula
Benjamin LeBrun
Li Du
Ben Lipkin
Clemente Pasti
...
Ryan Cotterell
Vikash K. Mansinghka
Alexander K. Lew
Tim Vieira
Timothy J. O'Donnell
544
22
0
17 Apr 2025
Memorization vs. Reasoning: Updating LLMs with New Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Aochong Oliver Li
Tanya Goyal
KELM
351
9
0
16 Apr 2025
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents
Jason W. Wei
Zhiqing Sun
Spencer Papay
S. McKinney
Jeffrey Han
Isa Fulford
Hyung Won Chung
Alex Tachard Passos
W. Fedus
Amelia Glaese
274
206
0
16 Apr 2025
Offline Learning and Forgetting for Reasoning with Large Language Models
Tianwei Ni
Allen Nie
Sapana Chaudhary
Yao Liu
Huzefa Rangwala
Rasool Fakoor
ReLM, CLL, LRM
1.2K
1
0
15 Apr 2025
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
Shuai Zhao
Linchao Zhu
Yi Yang
456
4
0
14 Apr 2025
DeepTrans: Deep Reasoning Translation via Reinforcement Learning
Jiaan Wang
Fandong Meng
Jie Zhou
OffRL, LRM
459
1
0
14 Apr 2025
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù
Amirhossein Kazemnejad
Nicholas Meade
Arkil Patel
Dongchan Shin
Alejandra Zambrano
Karolina Stańczak
Peter Shaw
Christopher Pal
Siva Reddy
LLMAG
360
17
0
11 Apr 2025
TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models
Sher Badshah
Ali Emami
Hassan Sajjad
LLMAG, ELM
373
1
0
10 Apr 2025
Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations
Pedro Ferreira
Wilker Aziz
Ivan Titov
LRM
359
4
0
07 Apr 2025
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
Benjamin Lipkin
Benjamin LeBrun
Jacob Hoover Vigly
João Loula
David R. MacIver
...
Robert Bamler
Vikash K. Mansinghka
Timothy J. O'Donnell
Alexander K. Lew
Tim Vieira
373
6
0
07 Apr 2025
Building LLM Agents by Incorporating Insights from Computer Systems
Yapeng Mi
Zhi Gao
Xiaojian Ma
Qing Li
LLMAG
363
1
0
06 Apr 2025
The Use of Gaze-Derived Confidence of Inferred Operator Intent in Adjusting Safety-Conscious Haptic Assistance
Jeremy D. Webb
Michael Bowman
Songpo Li
Xiaoli Zhang
322
0
0
04 Apr 2025
On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows
Souradip Chakraborty
Mohammadreza Pourreza
Ruoxi Sun
Yiwen Song
Nino Scherrer
...
Furong Huang
Amrit Singh Bedi
Ahmad Beirami
Hamid Palangi
Tomas Pfister
546
2
0
02 Apr 2025
HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents
Shiyi Liu
Haiying Shen
Shuai Che
Mahdi Ghandi
Mingqin Li
LLMAG
361
4
0
01 Apr 2025
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap
Tong Nie
Jian Sun
Wei Ma
564
23
0
27 Mar 2025
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
International Conference on Learning Representations (ICLR), 2025
Souradip Chakraborty
Sujay Bhatt
Udari Madhushani Sehwag
Soumya Suvra Ghosal
Jiahao Qiu
Mengdi Wang
Dinesh Manocha
Furong Huang
Alec Koppel
Sumitra Ganesh
368
16
0
27 Mar 2025
3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models
Yujiao Shi
Mengchen Zhang
Tong Wu
Tengfei Wang
Gordon Wetzstein
Dahua Lin
Yu Qiao
ELM
600
6
0
27 Mar 2025
debug-gym: A Text-Based Environment for Interactive Debugging
Xingdi Yuan
Morgane M Moss
Charbel El Feghali
Chinmay Singh
Darya Moldavskaya
...
Lucas Caccia
Matheus Pereira
Minseon Kim
Alessandro Sordoni
Marc-Alexandre Côté
LLMAG
283
4
0
27 Mar 2025
MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search
Yunhai Hu
Yilun Zhao
Chen Zhao
Arman Cohan
ReLM, LRM
352
3
0
26 Mar 2025
OmniNova: A General Multimodal Agent Framework
Pengfei Du
LLMAG
213
0
0
25 Mar 2025