Improving Machine Translation with Human Feedback: An Exploration of
Quality Estimation as a Reward Model

23 January 2024

Zhuosheng Zhang

Rui Wang

Papers citing "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model"

MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods
M. Finkelstein, Subhajit Naskar, Mehdi Mirzazadeh, Apurva Shah, Markus Freitag
19 Sep 2023

Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang, Hang Zhou, Kai-Chun Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Yue Zhang, Jingbo Zhu
08 Aug 2023

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
04 Mar 2022