arXiv: 2112.09332
WebGPT: Browser-assisted question-answering with human feedback


17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Long Ouyang
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
    ALM
    RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 905 papers shown
Tool-Planner: Task Planning with Clusters across Multiple Tools
Yanming Liu
Xinyue Peng
Jiannan Cao
Xuhong Zhang
Sheng Cheng
Xun Wang
Jianwei Yin
Tianyu Du
LLMAG
37
3
0
06 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip H. S. Torr
João F. Henriques
Jakob N. Foerster
27
1
0
05 Jun 2024
Re-ReST: Reflection-Reinforced Self-Training for Language Agents
Zi-Yi Dou
Cheng-Fu Yang
Xueqing Wu
Kai-Wei Chang
Nanyun Peng
LRM
88
7
0
03 Jun 2024
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Lin Gui
Cristina Garbacea
Victor Veitch
BDL
LM&MA
41
36
0
02 Jun 2024
Aligning Language Models with Demonstrated Feedback
Omar Shaikh
Michelle S. Lam
Joey Hejna
Yijia Shao
Michael S. Bernstein
Diyi Yang
ALM
31
23
0
02 Jun 2024
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
Mingda Li
Xinyu Li
Yifan Chen
Wenfeng Xuan
Weinan Zhang
RALM
31
2
0
31 May 2024
Transfer Q Star: Principled Decoding for LLM Alignment
Souradip Chakraborty
Soumya Suvra Ghosal
Ming Yin
Dinesh Manocha
Mengdi Wang
Amrit Singh Bedi
Furong Huang
46
24
0
30 May 2024
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh
Yifan Hu
Iason Chaimalas
Viraj Mehta
Pier Giuseppe Sessa
Haitham Bou-Ammar
Ilija Bogunovic
23
23
0
30 May 2024
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang
Chengguang Tang
Dading Chong
Ke Shi
Guohua Tang
Feng Jiang
Haizhou Li
31
4
0
30 May 2024
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
Xuanfa Jin
Ziyan Wang
Yali Du
Meng Fang
Haifeng Zhang
Jun Wang
OffRL
LLMAG
51
5
0
30 May 2024
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion
Wei Cheng
Yuhan Wu
Wei Hu
30
11
0
30 May 2024
Stress-Testing Capability Elicitation With Password-Locked Models
Ryan Greenblatt
Fabien Roger
Dmitrii Krasheninnikov
David M. Krueger
32
13
0
29 May 2024
A Multi-Source Retrieval Question Answering Framework Based on RAG
Ridong Wu
Shuhong Chen
Xiangbiao Su
Yuankai Zhu
Yifei Liao
Jianming Wu
22
3
0
29 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
44
21
0
29 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
33
4
0
29 May 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong-jin Liu
3DV
33
6
0
29 May 2024
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
38
36
0
28 May 2024
Tool Learning with Large Language Models: A Survey
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jun Xu
Jirong Wen
LLMAG
31
80
0
28 May 2024
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
Zheng Wang
Shu Xian Teo
Jieer Ouyang
Yongjun Xu
Wei Shi
RALM
VLM
27
13
0
26 May 2024
Multi-Reference Preference Optimization for Large Language Models
Hung Le
Quan Tran
D. Nguyen
Kien Do
Saloni Mittal
Kelechi Ogueji
Svetha Venkatesh
55
0
0
26 May 2024
Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents
Zhengliang Shi
Shen Gao
Xiuyi Chen
Yue Feng
Lingyong Yan
Haibo Shi
Dawei Yin
Zhumin Chen
Suzan Verberne
LLMAG
47
6
0
26 May 2024
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
Minghao Chen
Yihang Li
Yanting Yang
Shiyu Yu
Binbin Lin
Xiaofei He
LLMAG
36
0
0
25 May 2024
Learning Generalizable Human Motion Generator with Reinforcement Learning
Yunyao Mao
Xiaoyang Liu
Wen-gang Zhou
Zhenbo Lu
Houqiang Li
35
2
0
24 May 2024
A Solution-based LLM API-using Methodology for Academic Information Seeking
Yuanchun Wang
Jifan Yu
Zijun Yao
Jing Zhang
Yuyang Xie
...
Yuanyao Li
Huihui Yuan
Lei Hou
Juan-Zi Li
Jie Tang
19
3
0
24 May 2024
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui
Ziyang Zhang
Wen Wu
Chao Zhang
31
1
0
24 May 2024
SimPO: Simple Preference Optimization with a Reference-Free Reward
Yu Meng
Mengzhou Xia
Danqi Chen
57
345
0
23 May 2024
LIRE: listwise reward enhancement for preference alignment
Mingye Zhu
Yi Liu
Lei Zhang
Junbo Guo
Zhendong Mao
26
7
0
22 May 2024
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving
Pai Zeng
Zhenyu Ning
Jieru Zhao
Weihao Cui
Mengwei Xu
Liwei Guo
Xusheng Chen
Yizhou Shan
LLMAG
40
4
0
18 May 2024
Generative Artificial Intelligence: A Systematic Review and Applications
S. S. Sengar
Affan Bin Hasan
Sanjay Kumar
Fiona Carroll
MedIm
28
51
0
17 May 2024
Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting
Xinzhe Li
Ming Liu
34
0
0
17 May 2024
RLHF Workflow: From Reward Modeling to Online RLHF
Hanze Dong
Wei Xiong
Bo Pang
Haoxiang Wang
Han Zhao
Yingbo Zhou
Nan Jiang
Doyen Sahoo
Caiming Xiong
Tong Zhang
OffRL
27
95
0
13 May 2024
METAREFLECTION: Learning Instructions for Language Agents using Past Reflections
Priyanshu Gupta
Shashank Kirtania
Ananya Singha
Sumit Gulwani
Arjun Radhakrishna
Sherry Shi
Gustavo Soares
LLMAG
32
4
0
13 May 2024
Value Augmented Sampling for Language Model Alignment and Personalization
Seungwook Han
Idan Shenfeld
Akash Srivastava
Yoon Kim
Pulkit Agrawal
OffRL
34
23
0
10 May 2024
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Matthew Renze
Erhan Guven
LRM
LLMAG
36
34
0
05 May 2024
Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization
Hamed Zamani
Michael Bendersky
34
23
0
05 May 2024
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
Lucas-Andrei Thil
Mirela Popa
Gerasimos Spanakis
LLMAG
27
2
0
01 May 2024
Almanac Copilot: Towards Autonomous Electronic Health Record Navigation
C. Zakka
Joseph Cho
Gracia Fahed
R. Shad
Michael Moor
...
Vishnu Ravi
Oliver Aalami
Roxana Daneshjou
Akshay Chaudhari
W. Hiesinger
25
4
0
30 Apr 2024
Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models
Alireza Salemi
Hamed Zamani
36
4
0
30 Apr 2024
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively
Tiziano Labruna
Jon Ander Campos
Gorka Azkune
18
9
0
30 Apr 2024
Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning
Mathieu Rita
Florian Strub
Rahma Chaabouni
Paul Michel
Emmanuel Dupoux
Olivier Pietquin
42
2
0
30 Apr 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Guhao Feng
Li Zhao
Di He
Jiang Bian
Liwei Wang
55
56
0
29 Apr 2024
GPT for Games: A Scoping Review (2020-2023)
Daijin Yang
Erica Kleinman
Casper Harteveld
AI4TS
AI4CE
31
11
0
27 Apr 2024
Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM
Xuan Zhang
Wei Gao
LRM
KELM
27
8
0
26 Apr 2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao
Jonathan D. Chang
Wenhao Zhan
Owen Oertell
Gokul Swamy
Kianté Brantley
Thorsten Joachims
J. Andrew Bagnell
Jason D. Lee
Wen Sun
OffRL
38
31
0
25 Apr 2024
Benchmarking Mobile Device Control Agents across Diverse Configurations
Juyong Lee
Taywon Min
Minyong An
Changyeon Kim
Kimin Lee
36
8
0
25 Apr 2024
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs
Davide Caffagni
Federico Cocchi
Nicholas Moratelli
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
KELM
31
35
0
23 Apr 2024
Aligning LLM Agents by Learning Latent Preference from User Edits
Ge Gao
Alexey Taymanov
Eduardo Salinas
Paul Mineiro
Dipendra Kumar Misra
LLMAG
37
27
0
23 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
67
46
0
23 Apr 2024
Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering
Jiapeng Li
Runze Liu
Yabo Liu
Tong Zhou
Mingling Li
Xiang Chen
LRM
30
3
0
22 Apr 2024
Filtered Direct Preference Optimization
Tetsuro Morimura
Mitsuki Sakamoto
Yuu Jinnai
Kenshi Abe
Kaito Ariu
40
13
0
22 Apr 2024