WebGPT: Browser-assisted question-answering with human feedback

17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Long Ouyang
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
ALM, RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,125 papers shown
GUICourse: From General Vision Language Models to Versatile GUI Agents
Wentong Chen
Junbo Cui
Jinyi Hu
Yujia Qin
Junjie Fang
...
Yupeng Huo
Yuan Yao
Yankai Lin
Zhiyuan Liu
Maosong Sun
LLMAG
421
94
0
17 Jun 2024
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
Neural Information Processing Systems (NeurIPS), 2024
Jifan Zhang
Lalit P. Jain
Yang Guo
Jiayi Chen
Kuan Lok Zhou
...
Scott Sievert
Timothy T. Rogers
Kevin Jamieson
Robert Mankoff
Robert Nowak
273
10
0
15 Jun 2024
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model
Hantao Zhou
Tianying Ji
Lukas Sommerhalder
Michael Goerner
Norman Hendrich
Jianwei Zhang
Fuchun Sun
Huazhe Xu
604
0
0
14 Jun 2024
PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Steven Abreu
Tiffany D. Do
Ruofei Du
Eric J. Gonzalez
Lee Payne
Daniel J. McDuff
Mar Gonzalez-Franco
310
6
0
14 Jun 2024
HelpSteer2: Open-source dataset for training top-performing reward models
Zhilin Wang
Yi Dong
Olivier Delalleau
Jiaqi Zeng
Gerald Shen
Daniel Egert
Jimmy J. Zhang
Makesh Narsimhan Sreedhar
Oleksii Kuchaiev
AI4TS
315
171
0
12 Jun 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
344
4
0
12 Jun 2024
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid
Ruotian Wu
Julia Grosse
Agustinus Kristiadi
Pascal Poupart
OffRL
615
5
0
12 Jun 2024
OPTune: Efficient Online Preference Tuning
Lichang Chen
Jiuhai Chen
Chenxi Liu
John Kirchenbauer
Davit Soselia
Chen Zhu
Tom Goldstein
Wanrong Zhu
Heng Huang
130
7
0
11 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
420
3
0
11 Jun 2024
CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only
Junhee Cho
Jihoon Kim
Daseul Bae
Jinho Choo
Youngjune Gwon
Yeong-Dae Kwon
LLMAG
120
4
0
11 Jun 2024
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent
Wenjia Xu
Zijian Yu
Yixu Wang
Jiuniu Wang
Yuanben Zhang
Guangzuo Li
Mugen Peng
LLMAG
472
7
0
11 Jun 2024
The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs
Mert Yazan
Suzan Verberne
F. Situmeang
MQ
184
6
0
10 Jun 2024
Information Theoretic Guarantees For Policy Alignment In Large Language Models
Youssef Mroueh
246
19
0
09 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM, ALM, LM&MA
432
73
0
09 Jun 2024
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
I-Hung Hsu
Zifeng Wang
Long T. Le
Lesly Miculicich
Nanyun Peng
Tomas Pfister
HILM
295
11
0
08 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM, ALM
287
89
0
06 Jun 2024
Prototypical Reward Network for Data-Efficient RLHF
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jinghan Zhang
Xiting Wang
Yiqiao Jin
Changyu Chen
Xinhao Zhang
Kunpeng Liu
ALM
272
27
0
06 Jun 2024
Tool-Planner: Task Planning with Clusters across Multiple Tools
Yanming Liu
Xinyue Peng
Jiannan Cao
Yuwei Zhang
Xuhong Zhang
Sheng Cheng
Xun Wang
Jianwei Yin
LLMAG
372
2
0
06 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Juil Sock
João F. Henriques
Jakob N. Foerster
223
2
0
05 Jun 2024
Re-ReST: Reflection-Reinforced Self-Training for Language Agents
Zi-Yi Dou
Cheng-Fu Yang
Xueqing Wu
Kai-Wei Chang
Nanyun Peng
LRM
574
19
0
03 Jun 2024
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Lin Gui
Cristina Garbacea
Victor Veitch
BDL, LM&MA
477
95
0
02 Jun 2024
Aligning Language Models with Demonstrated Feedback
Omar Shaikh
Michelle S. Lam
Joey Hejna
Yijia Shao
Michael S. Bernstein
Diyi Yang
ALM
367
11
0
02 Jun 2024
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
Mingda Li
Xinyu Li
Yifan Chen
Wenfeng Xuan
Weinan Zhang
RALM
429
2
0
31 May 2024
Transfer Q Star: Principled Decoding for LLM Alignment
Souradip Chakraborty
Soumya Suvra Ghosal
Ming Yin
Dinesh Manocha
Mengdi Wang
Amrit Singh Bedi
Furong Huang
282
42
0
30 May 2024
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh
Yifan Hu
Iason Chaimalas
Viraj Mehta
Pier Giuseppe Sessa
Haitham Bou-Ammar
Ilija Bogunovic
329
87
0
30 May 2024
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang
Chengguang Tang
Dading Chong
Ke Shi
Guohua Tang
Feng Jiang
Haizhou Li
214
4
0
30 May 2024
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
Xuanfa Jin
Ziyan Wang
Yali Du
Meng Fang
Haifeng Zhang
Jun Wang
OffRL, LLMAG
440
19
0
30 May 2024
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion
Wei Cheng
Yuhan Wu
Wei Hu
209
38
0
30 May 2024
Stress-Testing Capability Elicitation With Password-Locked Models
Ryan Greenblatt
Fabien Roger
Dmitrii Krasheninnikov
David M. Krueger
329
25
0
29 May 2024
A Multi-Source Retrieval Question Answering Framework Based on RAG
Ridong Wu
Shuhong Chen
Xiangbiao Su
Yuankai Zhu
Yifei Liao
Jianming Wu
125
7
0
29 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
239
41
0
29 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong Liu
Ruiming Tang
KELM
250
7
0
29 May 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong Liu
3DV
286
12
0
29 May 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
327
71
0
28 May 2024
Tool Learning with Large Language Models: A Survey
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jun Xu
Jirong Wen
LLMAG
343
217
0
28 May 2024
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
Zheng Wang
Shu Xian Teo
Jieer Ouyang
Yongjun Xu
Wei Shi
RALM, VLM
214
45
0
26 May 2024
Multi-Reference Preference Optimization for Large Language Models
Hung Le
Quan Tran
D. Nguyen
Kien Do
Saloni Mittal
Kelechi Ogueji
Svetha Venkatesh
196
5
0
26 May 2024
Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents
Zhengliang Shi
Shen Gao
Xiuyi Chen
Yue Feng
Lingyong Yan
Haibo Shi
D. Yin
Zhumin Chen
Suzan Verberne
LLMAG
366
6
0
26 May 2024
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
Minghao Chen
Yihang Li
Yanting Yang
Shiyu Yu
Binbin Lin
Xiaofei He
LLMAG
287
0
0
25 May 2024
Learning Generalizable Human Motion Generator with Reinforcement Learning
Yunyao Mao
Xiaoyang Liu
Wen-gang Zhou
Zhenbo Lu
Houqiang Li
242
7
0
24 May 2024
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui
Ziyang Zhang
Wen Wu
Chao Zhang
387
5
0
24 May 2024
SoAy: A Solution-based LLM API-using Methodology for Academic Information Seeking
Yuanchun Wang
Jifan Yu
Zijun Yao
Jing Zhang
Yuyang Xie
...
Yuanyao Li
Huihui Yuan
Lei Hou
Juan-Zi Li
Jie Tang
274
10
0
24 May 2024
SimPO: Simple Preference Optimization with a Reference-Free Reward
Neural Information Processing Systems (NeurIPS), 2024
Yu Meng
Mengzhou Xia
Danqi Chen
543
791
0
23 May 2024
LIRE: listwise reward enhancement for preference alignment
Mingye Zhu
Yi Liu
Lei Zhang
Junbo Guo
Zhendong Mao
208
8
0
22 May 2024
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving
Pai Zeng
Zhenyu Ning
Jieru Zhao
Weihao Cui
Mengwei Xu
Liwei Guo
Xusheng Chen
Yizhou Shan
LLMAG
298
5
0
18 May 2024
Generative Artificial Intelligence: A Systematic Review and Applications
S. S. Sengar
Affan Bin Hasan
Sanjay Kumar
Fiona Carroll
MedIm
301
231
0
17 May 2024
Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting
Xinzhe Li
Ming Liu
251
1
0
17 May 2024
RLHF Workflow: From Reward Modeling to Online RLHF
Hanze Dong
Wei Xiong
Bo Pang
Haoxiang Wang
Han Zhao
Yingbo Zhou
Nan Jiang
Doyen Sahoo
Caiming Xiong
Tong Zhang
OffRL
274
209
0
13 May 2024
METAREFLECTION: Learning Instructions for Language Agents using Past Reflections
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Priyanshu Gupta
Shashank Kirtania
Ananya Singha
Sumit Gulwani
Arjun Radhakrishna
Sherry Shi
Gustavo Soares
LLMAG
164
16
0
13 May 2024
Value Augmented Sampling for Language Model Alignment and Personalization
Seungwook Han
Idan Shenfeld
Akash Srivastava
Yoon Kim
Pulkit Agrawal
OffRL
248
40
0
10 May 2024