Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
arXiv:2405.18688

29 May 2024
Fengshuo Bai
Rui Zhao
Hongming Zhang
Sijia Cui
Ying Wen
Yaodong Yang
Bo Xu
Lei Han
    OffRL

Papers citing "Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation"

9 / 9 papers shown
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
Yao Luan
Ni Mu
Yiqin Yang
Bo Xu
Qing-Shan Jia
95
0
0
28 Sep 2025
DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
Kefei Zhu
Fengshuo Bai
YuanHao Xiang
Yishuai Cai
Xinglin Chen
...
X. Wang
Hao Dong
Yaodong Yang
Xiaopeng Fan
Yuanpei Chen
96
3
0
28 Sep 2025
Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance
Songsheng Wang
Rucheng Yu
Zhihang Yuan
Chao Yu
Feng Gao
Yu-Ping Wang
Derek F. Wong
181
7
0
30 Jul 2025
β-DQN: Improving Deep Q-Learning By Evolving the Behavior
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Hongming Zhang
Fengshuo Bai
Chenjun Xiao
Chao Gao
Bo Xu
Martin Müller
OffRL
374
3
0
01 Jan 2025
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
AAAI Conference on Artificial Intelligence (AAAI), 2024
Fengshuo Bai
Runze Liu
Yali Du
Ying Wen
Yaodong Yang
AAML
334
12
0
14 Dec 2024
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Runchuan Zhu
Zhipeng Ma
Jiang Wu
Junyuan Gao
Jiaqi Wang
Dahua Lin
Conghui He
180
6
0
09 Oct 2024
Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving
Xiaosong Jia
Zhenjie Yang
Qifeng Li
Zhiyuan Zhang
Junchi Yan
ELM
418
137
0
06 Jun 2024
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
International Conference on Machine Learning (ICML), 2023
Runze Liu
Yali Du
Fengshuo Bai
Jiafei Lyu
Xiu Li
350
9
0
06 Jun 2023
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Yuke Zhu
J. Wong
Ajay Mandlekar
Roberto Martín-Martín
Abhishek Joshi
Soroush Nasiriany
Yifeng Zhu
525
562
0
25 Sep 2020