ResearchTrend.AI
WebGPT: Browser-assisted question-answering with human feedback
arXiv: 2112.09332

17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Long Ouyang
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
    ALM
    RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 905 papers shown
Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching
Jie Peng
Zhang Cao
Huaizhi Qu
Zhengyu Zhang
Chang Guo
Yanyong Zhang
Zhichao Cao
Tianlong Chen
34
2
0
17 Oct 2024
Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?
Xianren Zhang
Xianfeng Tang
Hui Liu
Zongyu Wu
Qi He
Dongwon Lee
Suhang Wang
ALM
41
0
0
16 Oct 2024
On the Capacity of Citation Generation by Large Language Models
Haosheng Qian
Yixing Fan
Ruqing Zhang
J. Guo
HILM
18
1
0
15 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
J. Liu
MoE
25
0
0
14 Oct 2024
MisinfoEval: Generative AI in the Era of "Alternative Facts"
Saadia Gabriel
Liang Lyu
James Siderius
Marzyeh Ghassemi
Jacob Andreas
Asu Ozdaglar
26
2
0
13 Oct 2024
LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning
Jiachun Li
Pengfei Cao
Chenhao Wang
Zhuoran Jin
Yubo Chen
Kang-Jun Liu
Xiaojian Jiang
Jiexin Xu
Jun Zhao
LRM
KELM
34
0
0
12 Oct 2024
Retrieving Contextual Information for Long-Form Question Answering using Weak Supervision
Philipp Christmann
Svitlana Vakulenko
Ionut Teodor Sorodoc
Bill Byrne
Adria de Gispert
RALM
29
0
0
11 Oct 2024
Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents
Priyanshu Kumar
Elaine Lau
Saranya Vijayakumar
Tu Trinh
Scale Red Team
...
Sean Hendryx
Shuyan Zhou
Matt Fredrikson
Summer Yue
Zifan Wang
LLMAG
34
17
0
11 Oct 2024
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
119
1
0
11 Oct 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng
Liangming Pan
Xunjian Yin
Xinyi Wang
William Yang Wang
KELM
37
4
0
10 Oct 2024
Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
Konstantina Christakopoulou
Shibl Mourad
Maja Matarić
LLMAG
31
10
0
10 Oct 2024
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
Amrith Rajagopal Setlur
Chirag Nagpal
Adam Fisch
Xinyang Geng
Jacob Eisenstein
Rishabh Agarwal
Alekh Agarwal
Jonathan Berant
Aviral Kumar
OffRL
LRM
42
41
0
10 Oct 2024
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
Jarrid Rector-Brooks
Mohsin Hasan
Zhangzhi Peng
Zachary Quinn
Chenghao Liu
...
Michael Bronstein
Yoshua Bengio
Pranam Chatterjee
Alexander Tong
Avishek Joey Bose
DiffM
47
6
0
10 Oct 2024
AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction
Hongru Wang
Rui Wang
Boyang Xue
Heming Xia
Jingtao Cao
Zeming Liu
Jeff Z. Pan
Kam-Fai Wong
ALM
30
8
0
10 Oct 2024
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jun Xu
Ji-Rong Wen
58
9
0
10 Oct 2024
Uncovering Factor Level Preferences to Improve Human-Model Alignment
Juhyun Oh
Eunsu Kim
Jiseon Kim
Wenda Xu
Inha Cha
William Yang Wang
Alice H. Oh
32
0
0
09 Oct 2024
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong
Li Dong
Xingxing Zhang
Zhifang Sui
Furu Wei
SyDa
40
6
0
09 Oct 2024
ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents
Jakub Hoscilowicz
Bartosz Maj
Bartosz Kozakiewicz
Oleksii Tymoshchuk
Artur Janicki
LLMAG
47
5
0
09 Oct 2024
TinyClick: Single-Turn Agent for Empowering GUI Automation
Pawel Pawlowski
Krystian Zawistowski
Wojciech Lapacz
Marcin Skorupa
Adam Wiacek
Sebastien Postansque
Jakub Hoscilowicz
MLLM
LLMAG
LRM
44
6
0
09 Oct 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
73
1
0
09 Oct 2024
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities
Zhenchao Jin
Mengchen Liu
Dongdong Chen
Lingting Zhu
Yunsheng Li
Lequan Yu
KELM
26
0
0
08 Oct 2024
Integrating Planning into Single-Turn Long-Form Text Generation
Yi Liang
You Wu
Honglei Zhuang
Li Chen
Jiaming Shen
...
Zhen Qin
Sumit Sanghai
Xuanhui Wang
Carl Yang
Michael Bendersky
48
3
0
08 Oct 2024
Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation
Bolei He
Nuo Chen
Xinran He
Lingyong Yan
Zhenkai Wei
Jinchang Luo
Zhen-Hua Ling
RALM
LRM
28
1
0
08 Oct 2024
AgentSquare: Automatic LLM Agent Search in Modular Design Space
Yu Shang
Yu Li
Keyu Zhao
Likai Ma
J. Liu
Fengli Xu
Yong Li
LLMAG
47
9
0
08 Oct 2024
Driving with Regulation: Interpretable Decision-Making for Autonomous Vehicles with Retrieval-Augmented Reasoning via LLM
Tianhui Cai
Yifan Liu
Zewei Zhou
Haoxuan Ma
Seth Z. Zhao
Zhiwen Wu
Jiaqi Ma
42
7
0
07 Oct 2024
LRHP: Learning Representations for Human Preferences via Preference Pairs
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Qiaozhi He
Murun Yang
Tong Xiao
Chunliang Zhang
Tongran Liu
Jingbo Zhu
AI4TS
32
0
0
06 Oct 2024
Aligning LLMs with Individual Preferences via Interaction
Shujin Wu
May Fung
Cheng Qian
Jeonghwan Kim
Dilek Z. Hakkani-Tür
Heng Ji
28
10
0
04 Oct 2024
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
Huimu Yu
Xing Wu
Weidong Yin
Debing Zhang
Songlin Hu
LRM
26
5
0
03 Oct 2024
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai
Haoran Sun
Huang Fang
Shuohuan Wang
Yu Sun
Hua-Hong Wu
132
1
0
03 Oct 2024
Evaluating Robustness of Reward Models for Mathematical Reasoning
Sunghwan Kim
Dongjin Kang
Taeyoon Kwon
Hyungjoo Chae
Jungsoo Won
Dongha Lee
Jinyoung Yeo
30
4
0
02 Oct 2024
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang
Alexander Bukharin
Olivier Delalleau
Daniel Egert
Gerald Shen
Jiaqi Zeng
Oleksii Kuchaiev
Yi Dong
ALM
42
39
0
02 Oct 2024
HybridFlow: A Flexible and Efficient RLHF Framework
Guangming Sheng
Chi Zhang
Zilingfeng Ye
Xibin Wu
Wang Zhang
Ru Zhang
Yanghua Peng
Haibin Lin
Chuan Wu
AI4CE
31
66
0
28 Sep 2024
Open-World Evaluation for Retrieving Diverse Perspectives
Hung-Ting Chen
Eunsol Choi
35
0
0
26 Sep 2024
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang
Lei Ying
OffRL
37
2
0
25 Sep 2024
Analyzing Probabilistic Methods for Evaluating Agent Capabilities
Axel Højmark
Govind Pimpale
Arjun Panickssery
Marius Hobbhahn
Jérémy Scheurer
18
4
0
24 Sep 2024
LLM With Tools: A Survey
Zhuocheng Shen
41
9
0
24 Sep 2024
LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMs
Sihui Yang
Keping Bi
Wanqing Cui
Jiafeng Guo
Xueqi Cheng
18
2
0
23 Sep 2024
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
67
4
0
23 Sep 2024
Backtracking Improves Generation Safety
Yiming Zhang
Jianfeng Chi
Hailey Nguyen
Kartikeya Upasani
Daniel M. Bikel
Jason Weston
Eric Michael Smith
SILM
41
6
0
22 Sep 2024
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models
Shengsheng Qian
Zuyi Zhou
Dizhan Xue
Bing Wang
Changsheng Xu
LRM
34
1
0
19 Sep 2024
Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation
Chen Liang
Zhifan Feng
Zihe Liu
Wenbin Jiang
Jinan Xu
Yufeng Chen
Yong Wang
LLMAG
LRM
25
1
0
19 Sep 2024
From Lists to Emojis: How Format Bias Affects Model Alignment
Xuanchang Zhang
Wei Xiong
Lichang Chen
Tianyi Zhou
Heng Huang
Tong Zhang
ALM
33
10
0
18 Sep 2024
CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration
Jiahui Gao
Renjie Pi
Tianyang Han
Han Wu
Lanqing Hong
Lingpeng Kong
Xin Jiang
Zhenguo Li
41
5
0
17 Sep 2024
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Maojia Song
Shang Hong Sim
Rishabh Bhardwaj
Hai Leong Chieu
Navonil Majumder
Soujanya Poria
29
6
0
17 Sep 2024
Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation with LLMs
Yifan Wang
David Stevens
Pranay Shah
Wenwen Jiang
Miao Liu
...
Boying Gong
Daniel Lee
Jiabo Hu
Ning Zhang
Bob Kamma
37
1
0
16 Sep 2024
StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models
Baolong Bi
Shenghua Liu
Yiwei Wang
Lingrui Mei
Hongcheng Gao
Junfeng Fang
Xueqi Cheng
KELM
33
8
0
16 Sep 2024
Trustworthiness in Retrieval-Augmented Generation Systems: A Survey
Yujia Zhou
Yan Liu
Xiaoxi Li
Jiajie Jin
Hongjin Qian
Zheng Liu
Chaozhuo Li
Zhicheng Dou
Tsung-Yi Ho
Philip S. Yu
3DV
RALM
52
27
0
16 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Judy Hanwen Shen
Archit Sharma
Jun Qin
42
4
0
15 Sep 2024
Alignment of Diffusion Models: Fundamentals, Challenges, and Future
Buhua Liu
Shitong Shao
Bao Li
Lichen Bai
Zhiqiang Xu
Haoyi Xiong
James Kwok
Sumi Helal
Zeke Xie
39
11
0
11 Sep 2024
Policy Filtration in RLHF to Fine-Tune LLM for Code Generation
Wei Shen
Chuheng Zhang
OffRL
36
6
0
11 Sep 2024