InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion

6 January 2025
Zhaoyi Yan
Zhijie Sang
Yiming Zhang
Yuhao Fu
Baoyi He
Qi Zhou
Yining Di
Chunlin Ji
Shengyu Zhang
Leilei Gan
MoMe LRM

Papers citing "InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion"

15 / 15 papers shown
Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration
Junqi Gao
Zhichang Guo
Dazhi Zhang
Dong Li
Runze Liu
Pengfei Li
Kai Tian
Biqing Qi
394
0
0
04 Jun 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Nicolas Boizard
Kevin El Haddad
Céline Hudelot
Pierre Colombo
442
26
0
28 Jan 2025
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
694
20
0
25 Nov 2024
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
International Conference on Learning Representations (ICLR), 2024
Shubham Toshniwal
Wei Du
Ivan Moshkov
Branislav Kisacanin
Alexan Ayrapetyan
Igor Gitman
LRM
413
123
0
02 Oct 2024
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM KELM MoMe
353
38
0
15 Aug 2024
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM VLM MU
628
1,685
0
15 Jul 2024
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek-AI
Qihao Zhu
Daya Guo
Zhihong Shao
Dejian Yang
...
Jiashi Li
Chenggang Zhao
Chong Ruan
Fuli Luo
Wenfeng Liang
MoE LRM ELM VLM
306
358
0
17 Jun 2024
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
I. Timiryasov
J. Tastet
350
72
0
03 Aug 2023
Editing Models with Task Arithmetic
International Conference on Learning Representations (ICLR), 2022
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM MoMe MU
1.2K
734
0
08 Dec 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM OffRL LRM
1.1K
6,736
0
27 Oct 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM ALM
1.9K
7,722
0
07 Jul 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM FaML
903
3,885
0
05 Mar 2021
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.0K
52,173
0
28 May 2020
Sequence-Level Knowledge Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016
Yoon Kim
Alexander M. Rush
447
1,199
0
25 Jun 2016
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
797
22,387
0
09 Mar 2015