ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2409.12500
  4. Cited By
LLMR: Knowledge Distillation with a Large Language Model-Induced Reward

LLMR: Knowledge Distillation with a Large Language Model-Induced Reward

19 September 2024
Dongheng Li
Yongchang Hao
Lili Mou
ArXivPDFHTML

Papers citing "LLMR: Knowledge Distillation with a Large Language Model-Induced Reward"

10 / 10 papers shown
Title
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
61
0
0
26 Apr 2025
EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation
EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation
Yuqiao Wen
Behzad Shayegh
Chenyang Huang
Yanshuai Cao
Lili Mou
29
4
0
29 Feb 2024
Prompting and Evaluating Large Language Models for Proactive Dialogues:
  Clarification, Target-guided, and Non-collaboration
Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration
Yang Deng
Lizi Liao
Liang Chen
Hongru Wang
Wenqiang Lei
Tat-Seng Chua
56
45
0
23 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less
  Training Data and Smaller Model Sizes
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Lokesh Nagalapatti
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
196
283
0
03 May 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale
  Instructions
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu
Abdul Waheed
Chiyu Zhang
Muhammad Abdul-Mageed
Alham Fikri Aji
ALM
115
115
0
27 Apr 2023
Few-shot training LLMs for project-specific code-summarization
Few-shot training LLMs for project-specific code-summarization
Toufique Ahmed
Prem Devanbu
163
137
0
09 Jul 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
Relating Neural Text Degeneration to Exposure Bias
Relating Neural Text Degeneration to Exposure Bias
Ting-Rui Chiang
Yun-Nung Chen
32
16
0
17 Sep 2021
Fine-Tuning Language Models from Human Preferences
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
273
1,561
0
18 Sep 2019
1