1711.02827
Cited By
v1
v2 (latest)
Inverse Reward Design
8 November 2017
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan
Papers citing "Inverse Reward Design"
50 / 265 papers shown
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov
Yaswanth Chittepu
Ryan Park
Harshit S. Sikchi
Joey Hejna
Bradley Knox
Chelsea Finn
S. Niekum
354
94
0
05 Jun 2024
REvolve: Reward Evolution with Large Language Models using Human Feedback
Rishi Hazra
Alkis Sygkounas
Andreas Persson
Amy Loutfi
Pedro Zuidberg Dos Martires
338
3
0
03 Jun 2024
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
International Conference on Machine Learning (ICML), 2024
Andi Peng
Yuying Sun
Tianmin Shu
David Abel
194
4
0
23 May 2024
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities
IEEE Communications Surveys and Tutorials (COMST), 2024
Hao Zhou
Chengming Hu
Ye Yuan
Yufei Cui
Yili Jin
...
Di Wu
Xue Liu
Charlie Zhang
Xianbin Wang
Jiangchuan Liu
268
165
0
17 May 2024
A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy
Zhaoxing Li
176
2
0
16 May 2024
Learning Reward for Robot Skills Using Large Language Models via Self-Alignment
Yuwei Zeng
Yao Mu
Lin Shao
334
22
0
12 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
366
3
0
30 Apr 2024
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
Shreyas Chaudhari
Pranjal Aggarwal
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
Ameet Deshpande
Bruno Castro da Silva
370
83
0
12 Apr 2024
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget
Glen Neville
Jiazhen Liu
Sonia Chernova
Harish Ravichandar
158
1
0
11 Apr 2024
Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods
Yuji Cao
Huan Zhao
Yuheng Cheng
Ting Shu
Guolong Liu
Gaoqi Liang
Junhua Zhao
Yun Li
LLMAG
KELM
OffRL
LM&Ro
364
146
0
30 Mar 2024
Scaling Learning based Policy Optimization for Temporal Tasks via Dropout
Navid Hashemi
Bardh Hoxha
Danil Prokhorov
Georgios Fainekos
Jyotirmoy Deshmukh
152
2
0
23 Mar 2024
Heuristic Algorithm-based Action Masking Reinforcement Learning (HAAM-RL) with Ensemble Inference Method
Kyuwon Choi
Cheolkyun Rho
Taeyoun Kim
D. Choi
OffRL
130
0
0
21 Mar 2024
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw
Shivam Singhal
Anca Dragan
AAML
334
11
0
05 Mar 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
184
10
0
28 Feb 2024
Advancing Investment Frontiers: Industry-grade Deep Reinforcement Learning for Portfolio Optimization
Philip Ndikum
Serge Ndikum
248
7
0
27 Feb 2024
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning
Simon Holk
Daniel Marta
Iolanda Leite
191
17
0
23 Feb 2024
Agents Need Not Know Their Purpose
Paulo Garcia
120
0
0
15 Feb 2024
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback
Yifu Yuan
Jianye Hao
Yi-An Ma
Zibin Dong
Hebin Liang
Jinyi Liu
Zhixin Feng
Kai-Wen Zhao
Yan Zheng
OffRL
ALM
307
19
0
04 Feb 2024
Inverse Reinforcement Learning by Estimating Expertise of Demonstrators
M. Beliaev
Ramtin Pedarsani
287
6
0
02 Feb 2024
Concept Alignment
Sunayana Rane
Polyphony J. Bruna
Ilia Sucholutsky
Christopher Kello
Thomas Griffiths
CVBM
140
15
0
09 Jan 2024
Global Rewards in Multi-Agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems
Conference on Learning for Dynamics & Control (L4DC), 2023
Heiko Hoppe
Tobias Enders
Quentin Cappart
Maximilian Schiffer
271
9
0
14 Dec 2023
Cross Fertilizing Empathy from Brain to Machine as a Value Alignment Strategy
Devin Gonier
Adrian Adduci
Cassidy LoCascio
123
0
0
10 Dec 2023
Increasing Transparency of Reinforcement Learning using Shielding for Human Preferences and Explanations
Georgios Angelopoulos
Luigi Mangiacapra
Alessandra Rossi
C. Napoli
Silvia Rossi
103
1
0
28 Nov 2023
Learning Reward for Physical Skills using Large Language Model
Yuwei Zeng
Yiqing Xu
183
7
0
21 Oct 2023
Eureka: Human-Level Reward Design via Coding Large Language Models
Yecheng Jason Ma
William Liang
Guanzhi Wang
De-An Huang
Osbert Bastani
Dinesh Jayaraman
Yuke Zhu
Linxi Fan
A. Anandkumar
264
462
0
19 Oct 2023
Getting aligned on representational alignment
Ilia Sucholutsky
Lukas Muttenthaler
Adrian Weller
Andi Peng
Andreea Bobu
...
Thomas Unterthiner
Andrew Kyle Lampinen
Klaus-Robert Müller
M. Toneva
Thomas Griffiths
290
132
0
18 Oct 2023
RoboCLIP: One Demonstration is Enough to Learn Robot Policies
Neural Information Processing Systems (NeurIPS), 2023
Sumedh Anand Sontakke
Jesse Zhang
Sébastien M. R. Arnold
Karl Pertsch
Erdem Biyik
Dorsa Sadigh
Chelsea Finn
Laurent Itti
OffRL
198
112
0
11 Oct 2023
Dynamic value alignment through preference aggregation of multiple objectives
Marcin Korecki
Damian Dailisan
Cesare Carissimo
228
1
0
09 Oct 2023
AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model
International Conference on Learning Representations (ICLR), 2023
Zibin Dong
Yifu Yuan
Jianye Hao
Fei Ni
Yao Mu
Yan Zheng
Yujing Hu
Tangjie Lv
Changjie Fan
Zhipeng Hu
211
38
0
03 Oct 2023
Large Language Model Alignment: A Survey
Shangda Wu
Renren Jin
Yufei Huang
Chuang Liu
Weilong Dong
Zishan Guo
Xinwei Wu
Yan Liu
Deyi Xiong
LM&MA
348
277
0
26 Sep 2023
Hierarchical Imitation Learning for Stochastic Environments
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Maximilian Igl
Punit Shah
Paul Mougin
S. Srinivasan
Tarun Gupta
Brandyn White
K. Shiarlis
Shimon Whiteson
OOD
165
3
0
25 Sep 2023
Human-Centered Autonomy for UAS Target Search
IEEE International Conference on Robotics and Automation (ICRA), 2023
Hunter M. Ray
Zakariya Laouar
Zachary Sunberg
Nisar R. Ahmed
184
3
0
12 Sep 2023
Learning Shared Safety Constraints from Multi-task Demonstrations
Neural Information Processing Systems (NeurIPS), 2023
Konwoo Kim
Gokul Swamy
Zuxin Liu
Ding Zhao
Sanjiban Choudhury
Zhiwei Steven Wu
197
23
0
01 Sep 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
353
701
0
27 Jul 2023
Designing Fiduciary Artificial Intelligence
Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), 2023
Sebastian Benthall
David Shekman
160
8
0
27 Jul 2023
Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Shukai Liu
Chenming Wu
Ying Li
Liang Zhang
257
1
0
11 Jul 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
321
3
0
22 Jun 2023
SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models
International Conference on Machine Learning (ICML), 2023
Shenghua Wan
Yucen Wang
Minghao Shao
Ruying Chen
De-Chuan Zhan
263
12
0
19 Jun 2023
Survival Instinct in Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Anqi Li
Dipendra Kumar Misra
Andrey Kolobov
Ching-An Cheng
OffRL
245
20
0
05 Jun 2023
PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward
Weichao Zhou
Wenchao Li
244
0
0
02 Jun 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya Zhang
297
14
0
27 May 2023
Beyond Reward: Offline Preference-guided Policy Optimization
International Conference on Machine Learning (ICML), 2023
Yachen Kang
Dingxu Shi
Jinxin Liu
Li He
Xuetao Zhang
OffRL
179
38
0
25 May 2023
Inverse Preference Learning: Preference-based RL without a Reward Function
Neural Information Processing Systems (NeurIPS), 2023
Joey Hejna
Dorsa Sadigh
OffRL
265
72
0
24 May 2023
Reward Learning with Intractable Normalizing Functions
IEEE Robotics and Automation Letters (RA-L), 2023
Joshua Hoegerman
Dylan P. Losey
170
2
0
16 May 2023
Fine-tuning Language Models with Generative Adversarial Reward Modelling
Z. Yu
Lau Jia Jaw
Zhang Hui
Bryan Kian Hsiang Low
ALM
192
6
0
09 May 2023
Diagnosing and Augmenting Feature Representations in Correctional Inverse Reinforcement Learning
Inês Lourenço
Andreea Bobu
C. Rojas
B. Wahlberg
200
0
0
11 Apr 2023
Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects
International Conference on Automated Planning and Scheduling (ICAPS), 2023
Siow Meng Low
Akshat Kumar
Scott Sanner
116
2
0
06 Apr 2023
Kernel Density Bayesian Inverse Reinforcement Learning
Aishwarya Mandyam
Didong Li
Jiayu Yao
Diana Cai
Andrew Jones
Barbara E. Engelhardt
OffRL
BDL
267
3
0
13 Mar 2023
Active Reward Learning from Multiple Teachers
Peter Barnett
Rachel Freedman
Justin Svegliato
Stuart J. Russell
156
17
0
02 Mar 2023
Reward Design with Language Models
International Conference on Learning Representations (ICLR), 2023
Minae Kwon
Sang Michael Xie
Kalesha Bullard
Dorsa Sadigh
LM&Ro
387
277
0
27 Feb 2023