1712.02838
Cited By
End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient
7 December 2017
Li Zhou
Kevin Small
Oleg Rokhlenko
Charles Elkan
OffRL
Papers citing
"End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient"
8 / 8 papers shown
Prompt-Based Length Controlled Generation with Reinforcement Learning
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
24
8
0
23 Aug 2023
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
21
11
0
30 Nov 2022
Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture
Abhishek Sethi
Zhijian Ou
Yi Huang
Junlan Feng
RALM
21
1
0
13 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
240
0
03 Oct 2022
Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings
Jorge Armando Mendez Mendez
Alborz Geramifard
Mohammad Ghavamzadeh
Bing-Quan Liu
OffRL
27
6
0
01 Jul 2022
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL
Catherine Cang
Aravind Rajeswaran
Pieter Abbeel
Michael Laskin
OffRL
32
29
0
16 Jun 2021
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
220
1,328
0
05 Jun 2016
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012