2401.10491
Cited By
v1
v2 (latest)
Knowledge Fusion of Large Language Models
19 January 2024
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
HuggingFace (5 upvotes)
Github (569★)
Papers citing "Knowledge Fusion of Large Language Models"
50 / 69 papers shown
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval
Ahmed Masry
Megh Thakkar
Patrice Bechard
Sathwik Tejaswi Madhusudhan
Rabiul Awal
...
Srivatsava Daruru
Enamul Hoque
Spandana Gella
Torsten Scholak
Sai Rajeswar
VLM
191
1
0
02 Nov 2025
AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis
Xuanzhong Chen
Zile Qiao
Guoxin Chen
L. Su
Zhen Zhang
Xinyu Wang
Pengjun Xie
Fei Huang
Jingren Zhou
Yong Jiang
LLMAG
ELM
164
3
0
28 Oct 2025
Teaming LLMs to Detect and Mitigate Hallucinations
Demian Till
John Smeaton
Peter Haubrick
Gouse Saheb
Florian Graef
David Berman
HILM
320
0
0
22 Oct 2025
Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding
Jinlin Li
Y. X. R. Wang
Yifei Yuan
Xiao Zhou
Y. Zhang
Xixian Yong
Yefeng Zheng
X. Wu
MLLM
151
0
0
21 Oct 2025
Lossless Vocabulary Reduction for Auto-Regressive Language Models
Daiki Chijiwa
Taku Hasegawa
Kyosuke Nishida
Shin'ya Yamaguchi
Tomoya Ohba
Tamao Sakao
Susumu Takeuchi
104
1
0
09 Oct 2025
TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA
Chanjoo Jung
Jaehyung Kim
149
0
0
06 Oct 2025
Making, not Taking, the Best of N
Ammar Khairi
Daniel D'souza
Marzieh Fadaee
Julia Kreutzer
MoMe
143
0
0
01 Oct 2025
Mixture of Thoughts: Learning to Aggregate What Experts Think, Not Just What They Say
Jacob Fein-Ashley
Dhruv Parikh
Rajgopal Kannan
Viktor Prasanna
MoE
MoMe
LRM
177
2
0
25 Sep 2025
Probabilistic Token Alignment for Large Language Model Fusion
Runjia Zeng
James Liang
Cheng Han
Zhiwen Cao
Jiahao Liu
...
Yingjie Victor Chen
Lifu Huang
Tong Geng
Qifan Wang
Dongfang Liu
164
1
0
21 Sep 2025
World Model Implanting for Test-time Adaptation of Embodied Agents
Minjong Yoo
Jinwoo Jang
Sihyung Yoon
Honguk Woo
LM&Ro
119
1
0
04 Sep 2025
IAENet: An Importance-Aware Ensemble Model for 3D Point Cloud-Based Anomaly Detection
Xuanming Cao
Chengyu Tao
Yifeng Cheng
Juan Du
3DPC
117
0
0
28 Aug 2025
A Taxonomy of Transcendence
Natalie Abreu
Edwin Zhang
Eran Malach
Sham Kakade
133
2
0
25 Aug 2025
Industrial LLM-based Code Optimization under Regulation: A Mixture-of-Agents Approach
Mari Ashiga
Vardan K. Voskanyan
Fateme Dinmohammadi
Jingzhi Gong
P. Brookes
Matthew Truscott
Rafail Giavrimis
Mike Basios
Leslie Kanthan
Wei Jie
144
1
0
05 Aug 2025
Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration
Junqi Gao
Zhichang Guo
Dazhi Zhang
Dong Li
Runze Liu
Pengfei Li
Kai Tian
Biqing Qi
399
0
0
04 Jun 2025
Linear Representation Transferability Hypothesis: Leveraging Small Models to Steer Large Models
Femi Bello
Anubrata Das
Fanzhi Zeng
Fangcong Yin
Liu Leqi
LLMSV
286
1
0
31 May 2025
LightRouter: Towards Efficient LLM Collaboration with Minimal Overhead
Yifan Zhang
Xinkui Zhao
Zuxin Wang
Guanjie Cheng
Yueshen Xu
Shuiguang Deng
Yuxiang Cai
226
3
0
22 May 2025
InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models
Yanggan Gu
Zhaoyi Yan
Yuanyi Wang
Yiming Zhang
Qi Zhou
Leilei Gan
Hongxia Yang
294
2
0
20 May 2025
A Survey on Collaborative Mechanisms Between Large and Small Language Models
Yi Chen
JiaHao Zhao
HaoHao Han
380
10
0
12 May 2025
A Weighted Byzantine Fault Tolerance Consensus Driven Trusted Multiple Large Language Models Network
IEEE Transactions on Cognitive Communications and Networking (TCCN), 2025
Haoxiang Luo
Gang Sun
Yinqiu Liu
Dongcheng Zhao
Dusit Niyato
Hongfang Yu
Schahram Dustdar
228
9
0
08 May 2025
Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks
Yang Liu
Bingjie Yan
Tianyuan Zou
Jianqing Zhang
Zixuan Gu
...
Jiajian Li
Xiaozhou Ye
Ye Ouyang
Qiang Yang
Yanzhe Zhang
ALM
1.0K
3
0
24 Apr 2025
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
Tianhui Song
Weixin Feng
Shuai Wang
Guojian Pang
Bangyu Xiang
Bo Zheng
Limin Wang
MoMe
340
4
0
16 Apr 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
International Conference on Learning Representations (ICLR), 2025
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
314
6
0
15 Apr 2025
A Dual-Space Framework for General Knowledge Distillation of Large Language Models
Wei Wei
Songming Zhang
Yunlong Liang
Fandong Meng
Yufeng Chen
Jinan Xu
Jie Zhou
374
0
0
15 Apr 2025
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion
Longguang Zhong
Fanqi Wan
Ziyi Yang
Guosheng Liang
Tianyuan Shi
Xiaojun Quan
MoMe
321
1
0
09 Apr 2025
LeForecast: Enterprise Hybrid Forecast by Time Series Intelligence
Zheng Tan
Yiwen Nie
Wenfa Wu
Guanyu Zhang
Yanze Liu
...
Chao Yang
Jiaxuan Fan
Yuan He
Hongsheng Qi
Yangzhou Du
AI4TS
245
0
0
27 Mar 2025
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling
Haebin Shin
Lei Ji
Xiao Liu
Yeyun Gong
338
2
0
24 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
332
8
0
13 Mar 2025
Collaborative Speculative Inference for Efficient LLM Inference Serving
Luyao Gao
Jianchun Liu
Hongli Xu
Xichong Zhang
Yunming Liao
Liusheng Huang
329
3
0
13 Mar 2025
System 0/1/2/3: Quad-process theory for multi-timescale embodied collective cognitive systems
Tadahiro Taniguchi
Yasushi Hirai
Masahiro Suzuki
Shingo Murata
Takato Horii
Kazutoshi Tanaka
AI4CE
345
3
0
08 Mar 2025
Rethinking Data: Towards Better Performing Domain-Specific Small Language Models
Boris Nazarov
Darya Frolova
Yackov Lubarsky
Alexei Gaissinski
Pavel Kisilev
ALM
248
1
0
03 Mar 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
645
8
0
18 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Liang Luo
Muneeza Azmat
Ang Li
R. Horesh
Mikhail Yurochkin
427
6
0
11 Feb 2025
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
International Conference on Learning Representations (ICLR), 2025
Makoto Shing
Yuichi Inoue
Han Bao
Sho Yokoi
Takuya Akiba
VLM
582
11
0
28 Jan 2025
Multi-Task Model Merging via Adaptive Weight Disentanglement
Feng Xiong
Runxi Cheng
Wang Chen
Zhanqiu Zhang
Yiwen Guo
Chun Yuan
Ruifeng Xu
MoMe
582
11
0
10 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
931
571
0
03 Jan 2025
Copyright-Protected Language Generation via Adaptive Model Fusion
International Conference on Learning Representations (ICLR), 2024
Javier Abad
Konstantin Donhauser
Francesco Pinto
Fanny Yang
342
3
0
09 Dec 2024
Enhancing Perception Capabilities of Multimodal LLMs with Training-Free Fusion
Zhuokun Chen
Jinwu Hu
Zeshuai Deng
Yufeng Wang
Bohan Zhuang
Zhuliang Yu
369
1
0
02 Dec 2024
H3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs
Selim Furkan Tekin
Fatih Ilhan
Tiansheng Huang
Sihao Hu
Yichang Xu
Zachary Yahn
Ling Liu
MoMe
363
8
0
26 Nov 2024
Exploring Model Kinship for Merging Large Language Models
Yedi Hu
Yunzhi Yao
Ningyu Zhang
Shumin Deng
Ningyu Zhang
MoMe
473
1
0
16 Oct 2024
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng
Zifeng Wang
Yike Wang
Sayna Ebrahimi
Hamid Palangi
...
Nathalie Rauschmayr
Yejin Choi
Yulia Tsvetkov
Zifeng Wang
Tomas Pfister
MoMe
309
16
0
15 Oct 2024
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
International Conference on Learning Representations (ICLR), 2024
Buu Phan
Brandon Amos
Itai Gat
Marton Havasi
Matthew Muckley
Karen Ullrich
272
10
0
11 Oct 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
Neural Information Processing Systems (NeurIPS), 2024
Xinyu Zhao
Zheyu Shen
Ruisi Cai
Yukun Zhou
Pingzhi Li
...
Binhang Yuan
Hongyi Wang
Ang Li
Zhangyang Wang
Tianlong Chen
MoMe
ALM
331
10
0
07 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Joey Tianyi Zhou
Tsendsuren Munkhdalai
MoMe
269
43
0
04 Oct 2024
Parameter Competition Balancing for Model Merging
Neural Information Processing Systems (NeurIPS), 2024
Guodong DU
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
Sim Kuan Goh
Jing Li
Daojing He
Min Zhang
MoMe
244
43
0
03 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
International Conference on Learning Representations (ICLR), 2024
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
235
18
0
03 Oct 2024
Disentangling Latent Shifts of In-Context Learning with Weak Supervision
Josip Jukić
Jan Snajder
297
1
0
02 Oct 2024
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
Anke Tang
Li Shen
Yong Luo
Shuai Xie
Han Hu
Lefei Zhang
Di Lin
Dacheng Tao
MoMe
310
9
0
19 Aug 2024
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
357
38
0
15 Aug 2024
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Maria Sandsten
B. Schuller
403
7
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
485
84
0
20 Jul 2024