Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation

18 January 2016

Papers citing "Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation"

Reinforcement learning

Florentin Wörgötter

734

3,169

16 May 2024

AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Neural Information Processing Systems (NeurIPS), 2023

Jimmy Ba

Tatsunori B. Hashimoto

ALM

654

831

22 May 2023

Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

Julia Kreutzer

David Vilar

Artem Sokolov

267

13 Oct 2021

Survey on reinforcement learning for language processing

Artificial Intelligence Review (AIR), 2021

Víctor Uc Cetina

Nicolás Navarro-Guerrero

376

141

12 Apr 2021

Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

432

02 Jan 2019

Preference-based Online Learning with Dueling Bandits: A Survey

Viktor Bengs

R. Busa-Fekete

Adil El Mesaoudi-Paul

Eyke Hüllermeier

486

133

30 Jul 2018

A Shared Task on Bandit Learning for Machine Translation

173

27 Jul 2017

Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback

Khanh Nguyen

Hal Daumé

Jordan L. Boyd-Graber

444

146

24 Jul 2017

Stochastic Structured Prediction under Bandit Feedback

Neural Information Processing Systems (NeurIPS), 2016

150

02 Jun 2016