WebGPT: Browser-assisted question-answering with human feedback
17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Ouyang Long
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
ALM, RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,125 papers shown
Strong hallucinations from negation and how to fix them
Nicholas Asher
Swarnadeep Bhar
ReLM, LRM
189
10
0
16 Feb 2024
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Lingbo Mo
Zeyi Liao
Boyuan Zheng
Yu-Chuan Su
Chaowei Xiao
Huan Sun
AAML, LLMAG
293
23
0
15 Feb 2024
RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models
Saeed Khaki
JinJin Li
Lan Ma
Liu Yang
Prathap Ramachandra
262
38
0
15 Feb 2024
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Kuang-Huei Lee
Xinyun Chen
Hiroki Furuta
John F. Canny
Ian S. Fischer
RALM
258
81
0
15 Feb 2024
InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling
Yuchun Miao
Sen Zhang
Liang Ding
Rong Bao
Lefei Zhang
Dacheng Tao
363
58
0
14 Feb 2024
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
Goutham Rajendran
Simon Buchholz
Bryon Aragam
Bernhard Schölkopf
Pradeep Ravikumar
AI4CE
430
31
0
14 Feb 2024
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents
Cheng Qian
Bingxiang He
Zhuang Zhong
Jia Deng
Yujia Qin
...
Zhong Zhang
Jie Zhou
Yankai Lin
Zhiyuan Liu
Maosong Sun
240
61
0
14 Feb 2024
Discovering Sensorimotor Agency in Cellular Automata using Diversity Search
Gautier Hamon
Mayalen Etcheverry
B. Chan
Clément Moulin-Frier
Pierre-Yves Oudeyer
AI4CE
244
9
0
14 Feb 2024
AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems
Jiaying Lu
Bo Pan
Jieyi Chen
Yingchaojie Feng
Jingyuan Hu
Yuchen Peng
Wei Chen
183
25
0
14 Feb 2024
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
Kenneth Li
Tianle Liu
Naomi Bashkansky
David Bau
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
362
24
0
13 Feb 2024
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
Fei Deng
Qifei Wang
Wei Wei
Matthias Grundmann
Tingbo Hou
EGVM
319
33
0
13 Feb 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF
International Conference on Machine Learning (ICML), 2024
Lichang Chen
Chen Zhu
Davit Soselia
Jiuhai Chen
Wanrong Zhu
Tom Goldstein
Heng-Chiao Huang
Mohammad Shoeybi
Bryan Catanzaro
AAML
312
107
0
11 Feb 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
Neural Information Processing Systems (NeurIPS), 2024
Chen Ye
Wei Xiong
Yuheng Zhang
Nan Jiang
Tong Zhang
OffRL
299
31
0
11 Feb 2024
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
International Conference on Machine Learning (ICML), 2024
Ryan Liu
T. Sumers
Ishita Dasgupta
Thomas Griffiths
LLMAG
151
28
0
11 Feb 2024
ScreenAgent: A Vision Language Model-driven Computer Control Agent
Runliang Niu
Jindong Li
Shiqi Wang
Yali Fu
Xiyu Hu
Xueyuan Leng
He Kong
Yi Chang
Zhiqiang Zhang
LLMAG, MLLM, LM&Ro
317
83
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM, LM&MA, ELM
856
789
0
09 Feb 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
310
119
0
08 Feb 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi
Wenxiang Chen
Boyang Hong
Senjie Jin
Rui Zheng
...
Xinbo Zhang
Yang Liu
Tao Gui
Xuanjing Huang
LRM
212
56
0
08 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
314
13
0
08 Feb 2024
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu
Yugeng Liu
Ziqing Yang
Xinyue Shen
Michael Backes
Yang Zhang
AAML
353
73
0
08 Feb 2024
Pedagogical Alignment of Large Language Models
Shashank Sonkar
Kangqi Ni
Sapana Chaudhary
Richard G. Baraniuk
AI4Ed
155
24
0
07 Feb 2024
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao
Pengle Zhang
Xu Han
Guangxuan Xiao
Yankai Lin
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
LLMAG
349
111
0
07 Feb 2024
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Gilles Baechler
Srinivas Sunkara
Maria Wang
Fedir Zubach
Hassan Mansoor
Vincent Etter
Victor Carbune
Jason Lin
Jindong Chen
Abhanshu Sharma
871
98
0
07 Feb 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang
Zeqiu Wu
Yushi Hu
Wenya Wang
HILM, LRM
274
42
0
06 Feb 2024
Personalized Language Modeling from Personalized Human Feedback
Xinyu Li
Zachary C. Lipton
Liu Leqi
ALM
429
106
0
06 Feb 2024
V-IRL: Grounding Virtual Intelligence in Real Life
European Conference on Computer Vision (ECCV), 2024
Jihan Yang
Runyu Ding
Ellis L Brown
Xiaojuan Qi
Saining Xie
LM&Ro
315
35
0
05 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM, HILM
219
7
0
04 Feb 2024
Enhance Reasoning for Large Language Models in the Game Werewolf
Shuang Wu
Liwen Zhu
Tao Yang
Shiwei Xu
Qiang Fu
Yang Wei
Haobo Fu
LRM, LLMAG
322
32
0
04 Feb 2024
Affordable Generative Agents
Yangbin Yu
Qin Zhang
Junyou Li
Qiang Fu
Deheng Ye
LLMAG, AI4CE
313
9
0
03 Feb 2024
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Kevin Wu
Eric Wu
Ally Cassasola
Angela Zhang
Kevin Wei
Teresa Nguyen
Sith Riantawan
Patricia Shi Riantawan
Mark A. Lemley
James Zou
LM&MA, ELM, AI4MH
278
43
0
03 Feb 2024
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Jian Xie
Kai Zhang
Jiangjie Chen
Tinghui Zhu
Renze Lou
Yuandong Tian
Yanghua Xiao
Yu-Chuan Su
LLMAG, LM&Ro
329
308
0
02 Feb 2024
Building Guardrails for Large Language Models
Yizhen Dong
Ronghui Mu
Gao Jin
Yi Qi
Jinwei Hu
Xingyu Zhao
Jie Meng
Wenjie Ruan
Xiaowei Huang
OffRL
417
69
0
02 Feb 2024
AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
Jian Guan
Wei Wu
Zujie Wen
Peng Xu
Hongning Wang
Shiyu Huang
LRM
201
31
0
02 Feb 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim
Minjoon Seo
SyDa, ALM
278
5
0
02 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM, LM&MA
627
96
0
02 Feb 2024
Plan-Grounded Large Language Models for Dual Goal Conversational Settings
Diogo Glória-Silva
Rafael Ferreira
Diogo Tavares
David Semedo
João Magalhães
LLMAG
178
6
0
01 Feb 2024
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Yuan Yao
Heng Ji
ELM, LLMAG, LM&Ro
873
334
0
01 Feb 2024
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Vidhisha Balachandran
Yulia Tsvetkov
351
168
0
01 Feb 2024
Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning
Yao-Hung Tsai
Walter Talbott
Jian Zhang
LLMAG
235
11
0
01 Feb 2024
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu
Michael I. Jordan
Jiantao Jiao
293
47
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
277
19
0
26 Jan 2024
Can LLMs Evaluate Complex Attribution in QA? Automatic Benchmarking using Knowledge Graphs
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Nan Hu
Jiaoyan Chen
Yike Wu
Guilin Qi
Hongru Wang
Sheng Bi
Yongrui Chen
Tongtong Wu
Jeff Z. Pan
HILM
392
8
0
26 Jan 2024
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Hongliang He
Wenlin Yao
Kaixin Ma
Wenhao Yu
Yong Dai
Hongming Zhang
Zhenzhong Lan
Dong Yu
LLMAG
509
250
0
25 Jan 2024
UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems
Hongru Wang
Wenyu Huang
Yang Deng
Rui Wang
Zezhong Wang
Yufei Wang
Fei Mi
Jeff Z. Pan
Kam-Fai Wong
RALM
299
50
0
24 Jan 2024
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents
Neural Information Processing Systems (NeurIPS), 2024
Chang Ma
Junlei Zhang
Zhihao Zhu
Cheng Yang
Yujiu Yang
Yaohui Jin
Zhenzhong Lan
Lingpeng Kong
Junxian He
ELM, LLMAG
237
133
0
24 Jan 2024
ARGS: Alignment as Reward-Guided Search
International Conference on Learning Representations (ICLR), 2024
Maxim Khanov
Jirayu Burapacheep
Yixuan Li
456
95
0
23 Jan 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
International Conference on Machine Learning (ICML), 2024
Songyang Gao
Qiming Ge
Wei Shen
Jiajun Sun
Junjie Ye
...
Yicheng Zou
Zhi Chen
Hang Yan
Tao Gui
Dahua Lin
246
20
0
21 Jan 2024
Reinforcement learning for question answering in programming domain using public community scoring as a human feedback
Alexey Gorbatovski
Sergey Kovalchuk
50
6
0
19 Jan 2024
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
Tongxin Yuan
Zhiwei He
Lingzhong Dong
Yiming Wang
Ruijie Zhao
...
Binglin Zhou
Fangqi Li
Zhuosheng Zhang
Rui Wang
Gongshen Liu
ELM
414
150
0
18 Jan 2024
QAnswer: Towards Question Answering Search over Websites
The Web Conference (WWW), 2022
Kunpeng Guo
Clement Defretiere
Dennis Diefenbach
Christophe Gravier
Antoine Gourru
182
6
0
17 Jan 2024