ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1707.09118
  4. Cited By
Counterfactual Learning from Bandit Feedback under Deterministic
  Logging: A Case Study in Statistical Machine Translation

Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation

28 July 2017
Carolin (Haas) Lawrence
Artem Sokolov
Stefan Riezler
    OffRL
ArXivPDFHTML

Papers citing "Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation"

2 / 2 papers shown
Title
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016
ADADELTA: An Adaptive Learning Rate Method
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
113
6,619
0
22 Dec 2012
1