1707.09118
Cited By
Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation
28 July 2017
Carolin (Haas) Lawrence
Artem Sokolov
Stefan Riezler
OffRL
Papers citing "Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation"
2 / 2 papers shown
Title
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
113
6,619
0
22 Dec 2012