Proximal Policy Optimization Algorithms

20 July 2017

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,421 papers shown

Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Faeze Brahman

...

Xiang Ren

Yejin Choi

326

24 May 2023

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

International Conference on Learning Representations (ICLR), 2023

Faeze Brahman

349

24 May 2023

Barkour: Benchmarking Animal-level Agility with Quadruped Robots

Ken Caluwaerts

Atil Iscen

J. Kew

Wenhao Yu

Tingnan Zhang

...

Jie Tan

225

24 May 2023

Inverse Reinforcement Learning with the Average Reward Criterion

Neural Information Processing Systems (NeurIPS), 2023

Feiyang Wu

Jingyang Ke

Anqi Wu

280

24 May 2023

Adaptive Policy Learning to Additional Tasks

220

24 May 2023

MARC: A multi-agent robots control framework for enhancing reinforcement learning in construction tasks

Kangkang Duan

C. W. Suen

Zhengbo Zou

116

23 May 2023

Learning from demonstrations: An intuitive VR environment for imitation learning of construction robots

Kangkang Duan

Zhengbo Zou

132

23 May 2023

RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning

Alexander Scarlatos

Andrew Lan

OffRL LRM

260

23 May 2023

Language Model Self-improvement by Reinforcement Learning Contemplation

International Conference on Learning Representations (ICLR), 2023

229

23 May 2023

Query Rewriting for Retrieval-Augmented Large Language Models

236

192

23 May 2023

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Zhiyuan Liu

Maosong Sun

Bowen Zhou

ALM

365

747

23 May 2023

Constrained Proximal Policy Optimization

113

23 May 2023

ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry

Chris Beeler

Sriram Ganapathi Subramanian

...

202

23 May 2023

Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning

Oswin So

Chuchu Fan

162

23 May 2023

RLBoost: Boosting Supervised Models using Deep Reinforcement Learning

Neurocomputing (Neurocomputing), 2023

Eloy Anguiano Batanero

Ángela Fernández Pascual

Á. Jiménez

OffRL

23 May 2023

Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML

ACM Transactions on Evolutionary Learning and Optimization (TELO), 2023

M. Deutel

G. Kontes

Christopher Mutschler

Jürgen Teich

519

23 May 2023

Constrained Reinforcement Learning for Dynamic Material Handling

IEEE International Joint Conference on Neural Network (IJCNN), 2023

169

23 May 2023

XRoute Environment: A Novel Reinforcement Learning Environment for Routing

109

23 May 2023

Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning

International Conference on Learning Representations (ICLR), 2023

263

23 May 2023

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

ACM Conference on Recommender Systems (RecSys), 2023

305

23 May 2023

Aligning Large Language Models through Synthetic Feedback

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

273

23 May 2023

Robust Model-Based Optimization for Challenging Fitness Landscapes

International Conference on Learning Representations (ICLR), 2023

234

23 May 2023

Developmental Curiosity and Social Interaction in Virtual Agents

Annual Meeting of the Cognitive Science Society (CogSci), 2023

135

22 May 2023

Training Diffusion Models with Reinforcement Learning

International Conference on Learning Representations (ICLR), 2023

578

640

22 May 2023

AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Neural Information Processing Systems (NeurIPS), 2023

Jimmy Ba

Tatsunori B. Hashimoto

ALM

488

764

22 May 2023

Making Language Models Better Tool Learners with Execution Feedback

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Huajun Chen

Ningyu Zhang

LLMAG

418

22 May 2023

Road Planning for Slums via Deep Reinforcement Learning

Knowledge Discovery and Data Mining (KDD), 2023

Y. Zheng

Hongyuan Su

Jingtao Ding

Depeng Jin

Yong Li

300

22 May 2023

Yes, this Way! Learning to Ground Referring Expressions into Actions with Intra-episodic Feedback from Supportive Teachers

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

P. Sadler

Sherzod Hakimov

David Schlangen

295

22 May 2023

Testing of Deep Reinforcement Learning Agents with Surrogate Models

ACM Transactions on Software Engineering and Methodology (TOSEM), 2023

Matteo Biagiola

Paolo Tonella

248

22 May 2023

Multi-task Hierarchical Adversarial Inverse Reinforcement Learning

International Conference on Machine Learning (ICML), 2023

210

22 May 2023

Strategy Extraction in Single-Agent Games

Archana Vadakattu

Michelle L. Blom

A. Pearce

172

22 May 2023

A Reinforcement Learning Approach for Robust Supervisory Control of UAVs Under Disturbances

Ibrahim Ahmed

Marcos Quiñones-Grueiro

Gautam Biswas

21 May 2023

BertRLFuzzer: A BERT and Reinforcement Learning Based Fuzzer

AAAI Conference on Artificial Intelligence (AAAI), 2023

280

21 May 2023

Synthesizing Diverse Human Motions in 3D Indoor Scenes

IEEE International Conference on Computer Vision (ICCV), 2023

Siyu Tang

349

103

21 May 2023

DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training

188

20 May 2023

Vision-based DRL Autonomous Driving Agent with Sim2Real Transfer

Dian-Tao Li

Ostap Okhrin

283

19 May 2023

Learning Diverse Risk Preferences in Population-based Self-play

AAAI Conference on Artificial Intelligence (AAAI), 2023

387

19 May 2023

Counterfactual Fairness Filter for Fair-Delay Multi-Robot Navigation

Adaptive Agents and Multi-Agent Systems (AAMAS), 2023

196

19 May 2023

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

165

19 May 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation

Artificial Intelligence Review (AIR), 2023

...

351

146

19 May 2023

Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models

International Conference on Machine Learning (ICML), 2023

Ding Zhao

143

18 May 2023

Constrained Environment Optimization for Prioritized Multi-Agent Navigation

Zhan Gao

Amanda Prorok

185

18 May 2023

Parallel development of social preferences in fish and machines

Annual Meeting of the Cognitive Science Society (CogSci), 2023

Joshua McGraw

Donsuk Lee

Justin N. Wood

18 May 2023

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

Neural Information Processing Systems (NeurIPS), 2023

424

18 May 2023

From Data-Fitting to Discovery: Interpreting the Neural Dynamics of Motor Control through Reinforcement Learning

Eugene R. Rush

Kaushik Jayaram

J. Humbert

151

18 May 2023

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL

Neural Information Processing Systems (NeurIPS), 2023

211

18 May 2023

Deep Metric Tensor Regularized Policy Gradient

Gang Chen

Victoria Huang

215

18 May 2023

Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks

Saptarshi Nath

Christos Peridis

Eseoghene Ben-Iwhiwhu

180

18 May 2023

Reinforcement Learning for Legged Robots: Motion Imitation from Model-Based Optimal Control

161

18 May 2023

Client Selection for Federated Policy Optimization with Environment Heterogeneity

Zhijie Xie

S. H. Song

453

18 May 2023