1706.03741
Cited By
Deep reinforcement learning from human preferences
12 June 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
Papers citing
"Deep reinforcement learning from human preferences"
50 / 701 papers shown
Title
Insert-expansions for Tool-enabled Conversational Agents
Andreas Göldi
Roman Rietsche
KELM
29
1
0
04 Jul 2023
Let Me Teach You: Pedagogical Foundations of Feedback for Language Models
Beatriz Borges
Niket Tandon
Tanja Käser
Antoine Bosselut
34
4
0
01 Jul 2023
On the Exploitability of Instruction Tuning
Manli Shu
Jiong Wang
Chen Zhu
Jonas Geiping
Chaowei Xiao
Tom Goldstein
SILM
47
92
0
28 Jun 2023
Fairness in Preference-based Reinforcement Learning
Umer Siddique
Abhinav Sinha
Yongcan Cao
19
4
0
16 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
46
7
0
14 Jun 2023
PokemonChat: Auditing ChatGPT for Pokémon Universe Knowledge
Laura Cabello
Jiaang Li
Ilias Chalkidis
ELM
AI4MH
LRM
24
2
0
05 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
69
159
0
02 Jun 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
31
23
0
01 Jun 2023
What does the Failure to Reason with "Respectively" in Zero/Few-Shot Settings Tell Us about Language Models?
Ruixiang Cui
Seolhwa Lee
Daniel Hershcovich
Anders Søgaard
35
2
0
31 May 2023
An Emergency Disposal Decision-making Method with Human-Machine Collaboration
Yibo Guo
Jingyi Xue
Yingkang Zhang
Mingliang Xu
35
0
0
29 May 2023
Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models
Qiang Zhang
Jason Naradowsky
Yusuke Miyao
ELM
31
33
0
29 May 2023
ConvGenVisMo: Evaluation of Conversational Generative Vision Models
Narjes Nikzad Khasmakhi
M. Asgari-Chenaghlu
Nabiha Asghar
Philipp Schaer
Dietlind Zuhlke
11
2
0
28 May 2023
Reward Collapse in Aligning Large Language Models
Ziang Song
Tianle Cai
Jason D. Lee
Weijie J. Su
ALM
33
22
0
28 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
43
180
0
27 May 2023
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback
Tom Bewley
J. Lawry
Arthur G. Richards
32
1
0
26 May 2023
Passive learning of active causal strategies in agents and language models
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Ishita Dasgupta
A. Nam
Jane X. Wang
34
15
0
25 May 2023
The False Promise of Imitating Proprietary LLMs
Arnav Gudibande
Eric Wallace
Charles Burton Snell
Xinyang Geng
Hao Liu
Pieter Abbeel
Sergey Levine
Dawn Song
ALM
44
199
0
25 May 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
66
291
0
24 May 2023
Metamathematics of Algorithmic Composition
Michael Gogins
44
2
0
24 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
79
618
0
23 May 2023
Training Diffusion Models with Reinforcement Learning
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
44
320
0
22 May 2023
Making Language Models Better Tool Learners with Execution Feedback
Shuofei Qiao
Honghao Gui
Chengfei Lv
Qianghuai Jia
Huajun Chen
Ningyu Zhang
LLMAG
48
46
0
22 May 2023
On the Limitations of Simulating Active Learning
Katerina Margatina
Nikolaos Aletras
36
11
0
21 May 2023
GPT-3.5, GPT-4, or BARD? Evaluating LLMs Reasoning Ability in Zero-Shot Setting and Performance Boosting Through Prompts
Jessica Nayeli López Espejel
E. Ettifouri
Mahaman Sanoussi Yahaya Alassan
El Mehdi Chouham
Walid Dahhane
ELM
LRM
34
85
0
21 May 2023
Continually Improving Extractive QA via Human Feedback
Ge Gao
Hung-Ting Chen
Yoav Artzi
Eunsol Choi
31
12
0
21 May 2023
Collaborative Development of NLP models
Fereshte Khani
Marco Tulio Ribeiro
38
2
0
20 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
52
83
0
19 May 2023
Expanding the Role of Affective Phenomena in Multimodal Interaction Research
Leena Mathur
Maja J. Matarić
Louis-Philippe Morency
21
2
0
18 May 2023
Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning
Yinglun Xu
Gagandeep Singh
OffRL
AAML
34
3
0
18 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip Torr
Adel Bibi
52
98
0
17 May 2023
Beyond the Safeguards: Exploring the Security Risks of ChatGPT
Erik Derner
Kristina Batistic
SILM
45
65
0
13 May 2023
Taking Advice from ChatGPT
Peter Zhang
45
5
0
11 May 2023
Coherent Wave Dynamics and Language Generation of a Generative Pre-trained Transformer
Tao Hong
19
0
0
08 May 2023
The Current State of Summarization
Fabian Retkowski
28
6
0
08 May 2023
Divide and Prompt: Chain of Thought Prompting for Text-to-SQL
X. Liu
Zhao Tan
ReLM
LRM
37
14
0
23 Apr 2023
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs
Brian M. Belgodere
Pierre Dognin
Adam Ivankay
Igor Melnyk
Youssef Mroueh
...
Mattia Rigotti
Jerret Ross
Yair Schiff
Radhika Vedpathak
Richard A. Young
39
12
0
21 Apr 2023
ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review
Sunder Ali Khowaja
P. Khuwaja
K. Dev
Weizheng Wang
Lewis Nkenyereye
29
76
0
13 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
Vojtěch Hudeček
Ondrej Dusek
31
57
0
13 Apr 2023
ChatGPT is all you need to decolonize sub-Saharan Vocational Education
Isidora Chara Tourni
G. Grigorakis
Isidoros Marougkas
Konstantinos M. Dafnis
Vassiliki‐Panagiota Tassopoulou
21
0
0
11 Apr 2023
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Haoran Li
Dadi Guo
Wei Fan
Mingshi Xu
Jie Huang
Fanpu Meng
Yangqiu Song
SILM
67
323
0
11 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Yingqiang Ge
Wenyue Hua
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
Yongfeng Zhang
VLM
LRM
57
214
0
10 Apr 2023
VOICE: Visual Oracle for Interaction, Conversation, and Explanation
Donggang Jia
Alexandra Irger
Lonni Besancon
Ondrej Strnad
Deng Luo
Johanna Björklund
Anders Ynnerman
I. Viola
40
2
0
08 Apr 2023
REFINER: Reasoning Feedback on Intermediate Representations
Debjit Paul
Mete Ismayilzada
Maxime Peyrard
Beatriz Borges
Antoine Bosselut
Robert West
Boi Faltings
ReLM
LRM
46
171
0
04 Apr 2023
To ChatGPT, or not to ChatGPT: That is the question!
Alessandro Pegoraro
Kavita Kumari
Hossein Fereidooni
A. Sadeghi
DeLMO
43
49
0
04 Apr 2023
GPT-4 can pass the Korean National Licensing Examination for Korean Medicine Doctors
Dongyeop Jang
Tae-Rim Yun
Choong-Yeol Lee
Young-Kyu Kwon
Chang-Eop Kim
ELM
LM&MA
35
26
0
31 Mar 2023
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Ge Li
Hasan Hammoud
Hani Itani
Dmitrii Khizbullin
Guohao Li
SyDa
ALM
52
413
0
31 Mar 2023
On the Creativity of Large Language Models
Giorgio Franceschelli
Mirco Musolesi
74
54
0
27 Mar 2023
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
...
Vinicius G. Goecks
Nicholas R. Waytowich
David Watkins
J. Miller
Rohin Shah
37
16
0
23 Mar 2023
Capabilities of GPT-4 on Medical Challenge Problems
Harsha Nori
Nicholas King
S. McKinney
Dean Carignan
Eric Horvitz
LM&MA
ELM
AI4MH
46
769
0
20 Mar 2023
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models
Junjie Ye
Xuanting Chen
Nuo Xu
Can Zu
Zekai Shao
...
Jie Zhou
Siming Chen
Tao Gui
Qi Zhang
Xuanjing Huang
ELM
38
312
0
18 Mar 2023