ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.02834
  4. Cited By
MergeDistill: Merging Pre-trained Language Models using Distillation

MergeDistill: Merging Pre-trained Language Models using Distillation

5 June 2021
Simran Khanuja
Melvin Johnson
Partha P. Talukdar
ArXivPDFHTML

Papers citing "MergeDistill: Merging Pre-trained Language Models using Distillation"

8 / 8 papers shown
Title
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao
Fanqi Wan
Jiajian Guo
Xiaojun Quan
Qifan Wang
ALM
58
0
0
25 Feb 2025
Stolen Subwords: Importance of Vocabularies for Machine Translation
  Model Stealing
Stolen Subwords: Importance of Vocabularies for Machine Translation Model Stealing
Vilém Zouhar
AAML
35
0
0
29 Jan 2024
Knowledge Fusion of Large Language Models
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
27
61
0
19 Jan 2024
How Good is Your Tokenizer? On the Monolingual Performance of
  Multilingual Language Models
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
69
234
0
31 Dec 2020
Rethinking embedding coupling in pre-trained language models
Rethinking embedding coupling in pre-trained language models
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
93
142
0
24 Oct 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
58
65
0
24 Oct 2020
What the [MASK]? Making Sense of Language-Specific BERT Models
What the [MASK]? Making Sense of Language-Specific BERT Models
Debora Nozza
Federico Bianchi
Dirk Hovy
84
105
0
05 Mar 2020
Google's Neural Machine Translation System: Bridging the Gap between
  Human and Machine Translation
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,740
0
26 Sep 2016
1