LLaMA Pro: Progressive LLaMA with Block Expansion

4 January 2024
Chengyue Wu
Yukang Gan
Yixiao Ge
Zeyu Lu
Jiahao Wang
Ye Feng
Ying Shan
Ping Luo
    CLL

Papers citing "LLaMA Pro: Progressive LLaMA with Block Expansion"

50 / 56 papers shown
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning
Jinpeng Chen
Runmin Cong
Yuzhi Zhao
Hongzheng Yang
Guangneng Hu
H. Ip
Sam Kwong
CLL
KELM
63
0
0
05 May 2025
RWKV-X: A Linear Complexity Hybrid Language Model
Haowen Hou
Zhiyi Huang
Kaifeng Tan
Rongchang Lu
Fei Richard Yu
VLM
78
0
0
30 Apr 2025
Kuwain 1.5B: An Arabic SLM via Language Injection
Khalil Hennara
Sara Chrouf
Mohamed Motaism Hamed
Zeina Aldallal
Omar Hadid
Safwan AlModhayan
29
1
0
21 Apr 2025
Enhancing knowledge retention for continual learning with domain-specific adapters and features gating
Mohamed Abbas Hedjazi
O. Hadjerci
Adel Hafiane
CLL
18
0
0
11 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
21
0
0
08 Apr 2025
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
Kazuki Yano
Takumi Ito
Jun Suzuki
LRM
47
1
0
05 Apr 2025
Neuroplasticity in Artificial Intelligence -- An Overview and Inspirations on Drop In & Out Learning
Yupei Li
M. Milling
Björn Schuller
AI4CE
102
0
0
27 Mar 2025
Adding Alignment Control to Language Models
Wenhong Zhu
Weinan Zhang
Rui Wang
50
0
0
06 Mar 2025
LoRA-Null: Low-Rank Adaptation via Null Space for Large Language Models
Pengwei Tang
Y. Liu
Dongjie Zhang
Xing Wu
Debing Zhang
57
0
0
04 Mar 2025
LESA: Learnable LLM Layer Scaling-Up
Yifei Yang
Zouying Cao
Xinbei Ma
Yao Yao
L. Qin
Z. Chen
Hai Zhao
59
0
0
20 Feb 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Min Lin
86
1
0
18 Feb 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
56
0
0
16 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Z. Wang
Muneeza Azmat
Ang Li
R. Horesh
Mikhail Yurochkin
107
1
0
11 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
101
0
0
03 Feb 2025
Bridging Interpretability and Robustness Using LIME-Guided Model Refinement
Navid Nayyem
Abdullah Rakin
Longwei Wang
AAML
FAtt
60
0
0
25 Dec 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe
VLM
21
0
0
01 Nov 2024
Exploring Forgetting in Large Language Model Pre-Training
Chonghua Liao
Ruobing Xie
X. Sun
Haowen Sun
Zhanhui Kang
CLL
30
0
0
22 Oct 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
ZiDong Wang
Zeyu Lu
Di Huang
Cai Zhou
Wanli Ouyang
Lei Bai
69
3
0
17 Oct 2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers
Shwai He
Tao Ge
Guoheng Sun
Bowei Tian
Xiaoyang Wang
Ang Li
MoE
46
1
0
17 Oct 2024
Upcycling Large Language Models into Mixture of Experts
Ethan He
Abhinav Khattar
R. Prenger
V. Korthikanti
Zijie Yan
Tong Liu
Shiqing Fan
Ashwath Aithal
M. Shoeybi
Bryan Catanzaro
MoE
17
9
0
10 Oct 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
Xinyu Zhao
Guoheng Sun
Ruisi Cai
Yukun Zhou
Pingzhi Li
...
Binhang Yuan
Hongyi Wang
Ang Li
Zhangyang Wang
Tianlong Chen
MoMe
ALM
26
2
0
07 Oct 2024
Neutral residues: revisiting adapters for model extension
Franck Signe Talla
Hervé Jégou
Edouard Grave
20
0
0
03 Oct 2024
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection
T. Chen
Zhentao Tan
Tao Gong
Yue Wu
Qi Chu
Bin Liu
Jieping Ye
Nenghai Yu
KELM
47
3
0
03 Oct 2024
Unlocking Memorization in Large Language Models with Dynamic Soft Prompting
Zhepeng Wang
Runxue Bao
Yawen Wu
Jackson Taylor
Cao Xiao
Feng Zheng
Weiwen Jiang
Shangqian Gao
Yanfu Zhang
PILM
36
7
0
20 Sep 2024
CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks
Zhaozhi Qian
Faroq Altam
Muhammad Alqurishi
Riad Souissi
24
1
0
19 Sep 2024
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio
Ningyuan Xi
Yetao Wu
Kun Fan
Teng Chen
Qingqing Gu
...
Jinxian Qu
Chenxi Liu
Zhonglin Jiang
Yong Chen
Luo Ji
ALM
27
0
0
10 Sep 2024
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
Hao Zhou
Zhijun Wang
Shujian Huang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Weihua Luo
Jiajun Chen
CLL
MoE
44
5
0
21 Aug 2024
Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
Jupinder Parmar
Sanjev Satheesh
M. Patwary
M. Shoeybi
Bryan Catanzaro
43
27
0
09 Jul 2024
DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
Zhen Tan
Daize Dong
Xinyu Zhao
Jie Peng
Yu Cheng
Tianlong Chen
MoE
37
4
0
03 Jul 2024
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
Guanting Dong
Yutao Zhu
Chenghao Zhang
Zechen Wang
Zhicheng Dou
Ji-Rong Wen
RALM
42
10
0
26 Jun 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
45
12
0
24 Jun 2024
Geneverse: A collection of Open-source Multimodal Large Language Models for Genomic and Proteomic Research
Tianyu Liu
Yijia Xiao
Xiao Luo
Hua Xu
W. Zheng
Hongyu Zhao
34
3
0
21 Jun 2024
Preserving Knowledge in Large Language Model with Model-Agnostic Self-Decompression
Zilun Zhang
Yutao Sun
Tiancheng Zhao
Leigang Sha
Ruochen Xu
Kyusong Lee
Jianwei Yin
CLL
KELM
46
0
0
17 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
28
14
0
10 Jun 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Yibo Yang
Xiaojie Li
Zhongzhu Zhou
S. Song
Jianlong Wu
Liqiang Nie
Bernard Ghanem
45
6
0
07 Jun 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
33
0
0
24 May 2024
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Wenyu Du
Tongxu Luo
Zihan Qiu
Zeyu Huang
Yikang Shen
Reynold Cheng
Yike Guo
Jie Fu
34
10
0
24 May 2024
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots
Chengyue Wu
Yixiao Ge
Qiushan Guo
Jiahao Wang
Zhixuan Liang
Zeyu Lu
Ying Shan
Ping Luo
MLLM
VLM
27
19
0
13 May 2024
CityLLaVA: Efficient Fine-Tuning for VLMs in City Scenario
Zhizhao Duan
Hao Cheng
Duo Xu
Xi Wu
Xiangxie Zhang
Xi Ye
Zhen Xie
24
6
0
06 May 2024
HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Weiran Xu
Yu Sun
Hua-Hong Wu
CLL
37
4
0
29 Apr 2024
Towards Incremental Learning in Large Language Models: A Critical Review
M. Jovanovic
Peter Voss
ELM
CLL
KELM
26
5
0
28 Apr 2024
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering
Stephen Choi
William Gazeley
KELM
21
2
0
19 Apr 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
75
75
0
20 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
44
50
0
13 Mar 2024
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
Zhuojun Li
Shilong Wang
Yang Wang
Yulin Hu
Yanyan Zhao
Chen Wei
Bing Qin
12
4
0
15 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
115
353
0
09 Feb 2024
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
125
137
0
19 Sep 2023
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu
Teng Wang
Yixiao Ge
Zeyu Lu
Rui-Zhi Zhou
Ying Shan
Ping Luo
MoMe
80
35
0
27 Apr 2023
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
134
116
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022