LLMR: Knowledge Distillation with a Large Language Model-Induced Reward

19 September 2024

Yongchang Hao

Lili Mou

Papers citing "LLMR: Knowledge Distillation with a Large Language Model-Induced Reward"

KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation | Jiabin Fan, Guoqing Luo, Michael Bowling, Lili Mou | OffRL 61 0 0 | 26 Apr 2025
EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation | Yuqiao Wen, Behzad Shayegh, Chenyang Huang, Yanshuai Cao, Lili Mou | 29 4 0 | 29 Feb 2024
Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration | Yang Deng, Lizi Liao, Liang Chen, Hongru Wang, Wenqiang Lei, Tat-Seng Chua | 56 45 0 | 23 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes | Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister | ALM 196 283 0 | 03 May 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji | ALM 115 115 0 | 27 Apr 2023
Few-shot training LLMs for project-specific code-summarization | Toufique Ahmed, Prem Devanbu | 163 137 0 | 09 Jul 2022
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM ALM 301 11,730 0 | 04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization | Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, ..., T. Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush | LRM 203 1,651 0 | 15 Oct 2021
Relating Neural Text Degeneration to Exposure Bias | Ting-Rui Chiang, Yun-Nung Chen | 32 16 0 | 17 Sep 2021
Fine-Tuning Language Models from Human Preferences | Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving | ALM 273 1,561 0 | 18 Sep 2019