ResearchTrend.AI
WebGPT: Browser-assisted question-answering with human feedback

17 December 2021
Reiichiro Nakano
Jacob Hilton
S. Balaji
Jeff Wu
Ouyang Long
Christina Kim
Christopher Hesse
Shantanu Jain
V. Kosaraju
William Saunders
Xu Jiang
K. Cobbe
Tyna Eloundou
Gretchen Krueger
Kevin Button
Matthew Knight
B. Chess
John Schulman
    ALM
    RALM

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 905 papers shown
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang
Arash Ahmadian
Kelly Marchisio
Julia Kreutzer
A. Ustun
Sara Hooker
37
21
0
02 Jul 2024
Concise and Precise Context Compression for Tool-Using Language Models
Yang Xu
Yunlong Feng
Honglin Mu
Yutai Hou
Yitong Li
...
Zhongyang Li
Dandan Tu
Qingfu Zhu
M. Zhang
Wanxiang Che
LLMAG
20
3
0
02 Jul 2024
LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis
Tianyu Cui
Shiyu Ma
Ziang Chen
Tong Xiao
Shimin Tao
...
Changchang Liu
Yuzhe Cai
Weibin Meng
Yongqian Sun
Dan Pei
ELM
22
4
0
02 Jul 2024
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation
Sirui Xia
Xintao Wang
Jiaqing Liang
Yifei Zhang
Weikang Zhou
Jiaji Deng
Fei Yu
Yanghua Xiao
RALM
11
6
0
01 Jul 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Yun-Nung Chen
VLM
ALM
26
4
0
01 Jul 2024
$\text{Memory}^3$: Language Modeling with Explicit Memory
Hongkang Yang
Zehao Lin
Wenjin Wang
Hao Wu
Zhiyu Li
...
Yu Yu
Kai Chen
Feiyu Xiong
Linpeng Tang
Weinan E
48
11
0
01 Jul 2024
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
Jingheng Ye
Yong Jiang
Xiaobin Wang
Yinghui Li
Yangning Li
Hai-Tao Zheng
Pengjun Xie
Fei Huang
38
2
0
01 Jul 2024
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning
Mingqian He
Yongliang Shen
Wenqi Zhang
Zeqi Tan
Weiming Lu
LRM
35
5
0
29 Jun 2024
Applying RLAIF for Code Generation with API-usage in Lightweight LLMs
Sujan Dutta
Sayantan Mahinder
R. Anantha
Bortik Bandyopadhyay
ALM
34
4
0
28 Jun 2024
Scalable and Domain-General Abstractive Proposition Segmentation
Mohammad Javad Hosseini
Yang Gao
Tim Baumgärtner
Alex Fabrikant
Reinald Kim Amplayo
31
0
0
28 Jun 2024
Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models
Georgios Tziafas
H. Kasaei
KELM
LM&Ro
40
8
0
26 Jun 2024
Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning
Sen Yang
Leyang Cui
Deng Cai
Xinting Huang
Shuming Shi
Wai Lam
38
8
0
25 Jun 2024
Reinforcement Learning via Auxiliary Task Distillation
Abhinav Harish
Larry Heck
Josiah P. Hanna
Z. Kira
Andrew Szot
31
0
0
24 Jun 2024
Towards Comprehensive Preference Data Collection for Reward Modeling
Yulan Hu
Qingyang Li
Sheng Ouyang
Ge Chen
Kaihui Chen
Lijun Mei
Xucheng Ye
Fuzheng Zhang
Yong Liu
SyDa
32
4
0
24 Jun 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li
Yifan Wang
A. Grama
Ruqi Zhang
AI4TS
47
9
0
24 Jun 2024
LOGIC-LM++: Multi-Step Refinement for Symbolic Formulations
Shashank Kirtania
Priyanshu Gupta
Arjun Radhakrishna
LRM
30
4
0
22 Jun 2024
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models
Xinrong Zhang
Yingfa Chen
Shengding Hu
Xu Han
Zihang Xu
Yuanwei Xu
Weilin Zhao
Maosong Sun
Zhiyuan Liu
32
9
0
22 Jun 2024
A SMART Mnemonic Sounds like "Glue Tonic": Mixing LLMs with Student Feedback to Make Mnemonic Learning Stick
Nishant Balepur
Matthew Shu
Alexander Hoyle
Alison Robey
Shi Feng
Seraphina Goldfarb-Tarrant
Jordan Boyd-Graber
44
1
0
21 Jun 2024
Hybrid Alignment Training for Large Language Models
Chenglong Wang
Hang Zhou
Kaiyan Chang
Bei Li
Yongyu Mu
Tong Xiao
Tongran Liu
Jingbo Zhu
35
4
0
21 Jun 2024
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models
Shilong Li
Yancheng He
Hangyu Guo
Xingyuan Bu
Ge Bai
...
Xingwei Qu
Yangguang Li
Wanli Ouyang
Wenbo Su
Bo Zheng
RALM
LLMAG
43
7
0
20 Jun 2024
FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering
Tianchi Cai
Zhiwen Tan
Xierui Song
Tao Sun
Jiyan Jiang
Yunqi Xu
Yinger Zhang
Jinjie Gu
27
5
0
19 Jun 2024
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation
Jirui Qi
Gabriele Sarti
Raquel Fernández
Arianna Bisazza
RALM
37
5
0
19 Jun 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti
Jie Zhang
Mislav Balunović
Luca Beurer-Kellner
Marc Fischer
Florian Tramèr
LLMAG
AAML
48
25
1
19 Jun 2024
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts
Honghua Dong
Qidong Su
Yubo Gao
Zhaoyu Li
Yangjun Ruan
Gennady Pekhimenko
Chris J. Maddison
Xujie Si
LLMAG
26
1
0
19 Jun 2024
Learning to Generate Answers with Citations via Factual Consistency Models
Rami Aly
Zhiqiang Tang
Samson Tan
George Karypis
HILM
34
4
0
19 Jun 2024
Think-then-Act: A Dual-Angle Evaluated Retrieval-Augmented Generation
Yige Shen
Hao Jiang
Hua Qu
Jihong Zhao
RALM
LRM
27
1
0
18 Jun 2024
LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization
Masafumi Enomoto
Kunihiro Takeoka
Kosuke Akimoto
Kiril Gashteovski
M. Oyamada
RALM
30
1
0
18 Jun 2024
WebCanvas: Benchmarking Web Agents in Online Environments
Yichen Pan
Dehan Kong
Sida Zhou
Cheng Cui
Yifei Leng
...
Hangyu Liu
Yanyi Shang
Shuyan Zhou
Tongshuang Wu
Zhengyang Wu
32
26
0
18 Jun 2024
Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Zhirui Chen
Vincent Y. F. Tan
OffRL
38
0
0
18 Jun 2024
Satyrn: A Platform for Analytics Augmented Generation
Marko Sterbentz
Cameron Barrie
Shubham Shahi
Abhratanu Dutta
Donna Hooshmand
Harper Pack
Kristian J. Hammond
31
0
0
17 Jun 2024
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
Kenneth Li
Yiming Wang
Fernanda Viégas
Martin Wattenberg
30
6
0
17 Jun 2024
KAOS: Large Model Multi-Agent Operating System
Zhao Zhuo
Rongzhen Li
Kai Liu
Huhai Zou
KaiMao Li
Jie Yu
Tianhao Sun
Qingbo Wu
VLM
LLMAG
41
1
0
17 Jun 2024
GUICourse: From General Vision Language Models to Versatile GUI Agents
Wentong Chen
Junbo Cui
Jinyi Hu
Yujia Qin
Junjie Fang
...
Yupeng Huo
Yuan Yao
Yankai Lin
Zhiyuan Liu
Maosong Sun
LLMAG
33
30
0
17 Jun 2024
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector
Xiaoxue Cheng
Junyi Li
Wayne Xin Zhao
Hongzhi Zhang
Fuzheng Zhang
Di Zhang
Kun Gai
Ji-Rong Wen
HILM
LLMAG
33
7
0
17 Jun 2024
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
41
8
0
17 Jun 2024
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
Jifan Zhang
Lalit P. Jain
Yang Guo
Jiayi Chen
Kuan Lok Zhou
...
Scott Sievert
Timothy Rogers
Kevin Jamieson
Robert Mankoff
Robert Nowak
33
5
0
15 Jun 2024
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model
Hantao Zhou
Tianying Ji
Lukas Sommerhalder
Michael Goerner
Norman Hendrich
Jianwei Zhang
Fuchun Sun
Huazhe Xu
45
0
0
14 Jun 2024
PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Steven Abreu
Tiffany D. Do
Karan Ahuja
Eric J. Gonzalez
Lee Payne
Daniel J. McDuff
Mar González-Franco
34
2
0
14 Jun 2024
HelpSteer2: Open-source dataset for training top-performing reward models
Zhilin Wang
Yi Dong
Olivier Delalleau
Jiaqi Zeng
Gerald Shen
Daniel Egert
Jimmy J. Zhang
Makesh Narsimhan Sreedhar
Oleksii Kuchaiev
AI4TS
49
83
0
12 Jun 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
53
2
0
12 Jun 2024
OPTune: Efficient Online Preference Tuning
Lichang Chen
Jiuhai Chen
Chenxi Liu
John Kirchenbauer
Davit Soselia
Chen Zhu
Tom Goldstein
Tianyi Zhou
Heng Huang
34
4
0
11 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
54
1
0
11 Jun 2024
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents
Wenjia Xu
Zijian Yu
Yixu Wang
Jiuniu Wang
Mugen Peng
LLMAG
48
7
0
11 Jun 2024
CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only
Junhee Cho
Jihoon Kim
Daseul Bae
Jinho Choo
Youngjune Gwon
Yeong-Dae Kwon
LLMAG
34
1
0
11 Jun 2024
The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs
Mert Yazan
Suzan Verberne
F. Situmeang
MQ
34
3
0
10 Jun 2024
Information Theoretic Guarantees For Policy Alignment In Large Language Models
Youssef Mroueh
29
6
0
09 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
97
30
0
09 Jun 2024
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation
I-Hung Hsu
Zifeng Wang
Long T. Le
Lesly Miculicich
Nanyun Peng
Chen-Yu Lee
Tomas Pfister
HILM
29
4
0
08 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
38
38
0
06 Jun 2024
Prototypical Reward Network for Data-Efficient RLHF
Jinghan Zhang
Xiting Wang
Yiqiao Jin
Changyu Chen
Xinhao Zhang
Kunpeng Liu
ALM
41
18
0
06 Jun 2024