APRIL: Active Preference-learning based Reinforcement Learning

5 August 2012
R. Akrour
Marc Schoenauer
Michèle Sebag
    OffRL

Papers citing "APRIL: Active Preference-learning based Reinforcement Learning"

50 / 61 papers shown
Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
Chengmin Zhou
Ville Kyrki
Pasi Fränti
Laura Ruotsalainen
BDL
AI4CE
536
1
0
12 May 2025
AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
Le Qiu
Zelai Xu
Qixin Tan
Wenhao Tang
Xinlei Chen
Yu Wang
AAML
371
0
0
24 Mar 2025
Advances in Preference-based Reinforcement Learning: A Review
IEEE International Conference on Systems, Man and Cybernetics (SMC), 2022
Youssef Abdelkareem
Shady Shehata
Fakhri Karray
OffRL
304
18
0
21 Aug 2024
Clinical Reading Comprehension with Encoder-Decoder Models Enhanced by Direct Preference Optimization
Md Sultan al Nahian
R. Kavuluru
MedIm
AI4CE
234
0
0
19 Jul 2024
Learning Human-Robot Handshaking Preferences for Quadruped Robots
Alessandra Chappuis
Guillaume Bellegarda
A. Ijspeert
322
2
0
28 Jun 2024
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
364
21
0
17 Jun 2024
Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation
Intelligent Medicine (IM), 2023
Nathan Phelps
Stephanie Marrocco
Stephanie Cornell
Dalton L. Wolfe
Daniel J. Lizotte
AI4CE
199
0
0
23 Oct 2023
AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model
International Conference on Learning Representations (ICLR), 2023
Zibin Dong
Yifu Yuan
Jianye Hao
Fei Ni
Yao Mu
Yan Zheng
Yujing Hu
Tangjie Lv
Changjie Fan
Zhipeng Hu
338
44
0
03 Oct 2023
Reinforcement Learning with Human Feedback for Realistic Traffic Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Yulong Cao
Boris Ivanovic
Chaowei Xiao
Marco Pavone
188
25
0
01 Sep 2023
Active Inverse Learning in Stackelberg Trajectory Games
American Control Conference (ACC), 2023
Yue Yu
Jacob Levy
Negar Mehr
David Fridovich-Keil
Ufuk Topcu
173
10
0
15 Aug 2023
Toward Grounded Commonsense Reasoning
IEEE International Conference on Robotics and Automation (ICRA), 2023
Minae Kwon
Hengyuan Hu
Vivek Myers
Siddharth Karamcheti
Anca Dragan
Dorsa Sadigh
LM&Ro
ReLM
LRM
314
20
0
14 Jun 2023
PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward
Weichao Zhou
Wenchao Li
360
0
0
02 Jun 2023
Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models
Wanqiao Xu
Shi Dong
Dilip Arumugam
Benjamin Van Roy
219
9
0
19 May 2023
Learning a Universal Human Prior for Dexterous Manipulation from Human Preference
Zihan Ding
Yuanpei Chen
Allen Z. Ren
S. Gu
Qianxu Wang
Hao Dong
Chi Jin
269
11
0
10 Apr 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
444
133
0
13 Mar 2023
Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback
Neural Information Processing Systems (NeurIPS), 2023
Han Shao
Lee Cohen
Avrim Blum
Yishay Mansour
Aadirupa Saha
Matthew R. Walter
OffRL
362
8
0
07 Feb 2023
Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
Josh Abramson
Arun Ahuja
Federico Carnevale
Petko Georgiev
Alex Goldin
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
262
38
0
21 Nov 2022
Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
Neural Information Processing Systems (NeurIPS), 2022
Zhizhou Ren
Hoang Trung-Dung
Yitao Liang
Jian-wei Peng
Jianzhu Ma
242
10
0
20 Nov 2022
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
Katherine Metcalf
Miguel Sarabia
B. Theobald
OffRL
206
5
0
12 Nov 2022
Argumentative Reward Learning: Reasoning About Human Preferences
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
HAI
312
2
0
28 Sep 2022
Learning Latent Traits for Simulated Cooperative Driving Tasks
Jonathan A. DeCastro
Deepak Gopinath
Guy Rosman
Emily S. Sumner
Shabnam Hakimi
Simon Stent
257
0
0
20 Jul 2022
Personalized Algorithmic Recourse with Preference Elicitation
Giovanni De Toni
P. Viappiani
Stefano Teso
Bruno Lepri
Baptiste Caramiaux
607
13
0
27 May 2022
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
International Conference on Machine Learning (ICML), 2022
Joar Skalse
Matthew Farrugia-Roberts
Stuart J. Russell
Alessandro Abate
Adam Gleave
337
55
0
14 Mar 2022
Uncertainty Estimation for Language Reward Models
Adam Gleave
G. Irving
UQLM
203
38
0
14 Mar 2022
Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions
International Statistical Review (ISR), 2022
Nina Deliu
Joseph Jay Williams
B. Chakraborty
OffRL
314
21
0
04 Mar 2022
Interpretable Preference-based Reinforcement Learning with Tree-Structured Reward Functions
Adaptive Agents and Multi-Agent Systems (AAMAS), 2021
Tom Bewley
Freddy Lecue
OffRL
307
14
0
20 Dec 2021
Scientific Discovery and the Cost of Measurement -- Balancing Information and Cost in Reinforcement Learning
C. Bellinger
Andriy Drozdyuk
Mark Crowley
Isaac Tamblyn
OffRL
275
10
0
14 Dec 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
420
109
0
08 Nov 2021
Learning Multimodal Rewards from Rankings
Conference on Robot Learning (CoRL), 2021
Vivek Myers
Erdem Biyik
Nima Anari
Dorsa Sadigh
OffRL
301
62
0
27 Sep 2021
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
International Conference on Machine Learning (ICML), 2021
Kimin Lee
Laura M. Smith
Pieter Abbeel
OffRL
536
377
0
09 Jun 2021
Information Directed Reward Learning for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
David Lindner
M. Turchetta
Sebastian Tschiatschek
K. Ciosek
Andreas Krause
OffRL
278
26
0
24 Feb 2021
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z Leibo
Kate Larson
T. Graepel
485
254
0
15 Dec 2020
Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach
Huixin Zhan
Feng Tao
Yongcan Cao
256
30
0
15 Oct 2020
Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
Journal of Artificial Intelligence Research (JAIR), 2020
Rodrigo Toro Icarte
Toryn Q. Klassen
Richard Valenzano
Sheila A. McIlraith
OffRL
541
299
0
06 Oct 2020
Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences
Erdem Biyik
Dylan P. Losey
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
412
140
0
24 Jun 2020
Active Measure Reinforcement Learning for Observation Cost Minimization
C. Bellinger
Rory Coles
Mark Crowley
Isaac Tamblyn
OffRL
184
29
0
26 May 2020
Active Preference-Based Gaussian Process Regression for Reward Learning
Erdem Biyik
Nicolas Huynh
Mykel J. Kochenderfer
Dorsa Sadigh
GP
359
131
0
06 May 2020
Reducing Non-Normative Text Generation from Language Models
Xiangyu Peng
Siyan Li
Spencer Frazier
Mark O. Riedl
257
8
0
23 Jan 2020
Learning Norms from Stories: A Prior for Value Aligned Agents
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2019
Spencer Frazier
Md Sultan al Nahian
Mark O. Riedl
Brent Harrison
198
41
0
07 Dec 2019
Reinforcing an Image Caption Generator Using Off-Line Human Feedback
AAAI Conference on Artificial Intelligence (AAAI), 2019
Paul Hongsuck Seo
Piyush Sharma
Tomer Levinboim
Bohyung Han
Radu Soricut
OffRL
260
24
0
21 Nov 2019
Context-aware Active Multi-Step Reinforcement Learning
Gang Chen
Dingcheng Li
Ran Xu
167
0
0
11 Nov 2019
Asking Easy Questions: A User-Friendly Approach to Active Reward Learning
Conference on Robot Learning (CoRL), 2019
Erdem Biyik
Malayandi Palan
Nicholas C. Landolfi
Dylan P. Losey
Dorsa Sadigh
213
137
0
10 Oct 2019
Scaling data-driven robotics with reward sketching and batch reinforcement learning
Serkan Cabi
Sergio Gomez Colmenarejo
Alexander Novikov
Ksenia Konyushkova
Scott E. Reed
...
David Barker
Jonathan Scholz
Misha Denil
Nando de Freitas
Ziyun Wang
OffRL
366
30
0
26 Sep 2019
Reinforcement Learning in Healthcare: A Survey
ACM Computing Surveys (ACM CSUR), 2019
Chao Yu
Jiming Liu
S. Nemati
LM&MA
OffRL
801
733
0
22 Aug 2019
Dueling Posterior Sampling for Preference-Based Reinforcement Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2019
Ellen R. Novoseller
Yibing Wei
Yanan Sui
Yisong Yue
J. W. Burdick
546
73
0
04 Aug 2019
Improving User Specifications for Robot Behavior through Active Preference Learning: Framework and Evaluation
Nils Wilde
Alex Blidaru
Stephen L. Smith
Dana Kulić
205
38
0
24 Jul 2019
Learning Reward Functions by Integrating Human Demonstrations and Preferences
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
172
146
0
21 Jun 2019
Batch Active Learning Using Determinantal Point Processes
Erdem Biyik
Kenneth Wang
Nima Anari
Dorsa Sadigh
320
71
0
19 Jun 2019
The Green Choice: Learning and Influencing Human Decisions on Shared Roads
Erdem Biyik
Daniel A. Lazar
Dorsa Sadigh
Ramtin Pedarsani
184
31
0
03 Apr 2019
Parenting: Safe Reinforcement Learning from Human Input
Christopher Frye
Ilya Feige
182
10
0
18 Feb 2019