Novel Policy Seeking with Constrained Optimization

21 May 2020
Hao Sun
Zhenghao Peng
Bo Dai
Jian Guo
Dahua Lin
Bolei Zhou
arXiv: 2005.10696

Papers citing "Novel Policy Seeking with Constrained Optimization"

7 / 7 papers shown
β-DQN: Improving Deep Q-Learning By Evolving the Behavior
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Hongming Zhang
Fengshuo Bai
Chenjun Xiao
Chao Gao
Bo Xu
Martin Müller
OffRL
168
4
0
01 Jan 2025
Iteratively Learn Diverse Strategies with State Distance Information
Neural Information Processing Systems (NeurIPS), 2023
Wei Fu
Weihua Du
Jingwei Li
Sunli Chen
Jingzhao Zhang
Yi Wu
182
5
0
23 Oct 2023
Keep Various Trajectories: Promoting Exploration of Ensemble Policies in Continuous Control
Neural Information Processing Systems (NeurIPS), 2023
Chao Li
Chen Gong
Qiang He
Xinwen Hou
119
2
0
17 Oct 2023
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun
OffRL
117
25
0
09 Oct 2023
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
187
20
0
17 Aug 2023
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization
International Conference on Learning Representations (ICLR), 2022
Zihan Zhou
Wei Fu
Bingliang Zhang
Yi Wu
160
34
0
04 Apr 2022
Discovering Diverse Athletic Jumping Strategies
ACM Transactions on Graphics (TOG), 2021
Zhiqi Yin
Zeshi Yang
M. van de Panne
KangKang Yin
156
47
0
02 May 2021