
Compact Language Models via Pruning and Knowledge Distillation

19 July 2024
Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, M. Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov
Tags: SyDa, MQ
Links: ArXiv (abs) · PDF · HTML · HuggingFace (40 upvotes)

Papers citing "Compact Language Models via Pruning and Knowledge Distillation"

13 / 63 papers shown
GPT for Games: An Updated Scoping Review (2020-2024)
IEEE Transactions on Games (IEEE Trans. Games), 2024
Daijin Yang, Erica Kleinman, Casper Harteveld
Tags: LLMAG, AI4TS, AI4CE
01 Nov 2024
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang, Dong Zhang, Yunhua Zhou, Jiaqi Leng
Tags: MQ
31 Oct 2024
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos, Iman Mirzadeh, Keivan Alizadeh, Mohammad Hossein Sekhavat, Moin Nabi, Mehrdad Farajtabar, Fartash Faghri
25 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
International Conference on Learning Representations (ICLR), 2024
Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Shiyu Huang
22 Oct 2024
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router
Yanyue Xie, Zhi Zhang, Ding Zhou, Cong Xie, Ziang Song, Xin Liu, Yanzhi Wang, Xue Lin, An Xu
Tags: LLMAG
15 Oct 2024
BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Xinyuan Wang, Victor Shea-Jay Huang, Renmiao Chen, Hao Wang, Changzai Pan, Lei Sha, Shiyu Huang
Tags: AAML
13 Oct 2024
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker, B. Staffler, Katharina Eggensperger, Aaron Klein
Tags: LRM
09 Oct 2024
Leveraging Large Language Models for Suicide Detection on Social Media with Limited Labels
BigData Congress [Services Society] (BSS), 2024
Vy Nguyen, Chau Pham
Tags: ALM, AI4MH
06 Oct 2024
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
International Conference on Learning Representations (ICLR), 2024
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
Tags: SyDa, OffRL, LRM
29 Aug 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Journal of Geophysical Research (JGR), 2024
Zhixiang Guo, Xinming Wu, Luming Liang, Hanlin Sheng, Nuo Chen, Zhengfa Bi
Tags: AI4CE
22 Aug 2024
LLM Pruning and Distillation in Practice: The Minitron Approach
Sharath Turuvekere Sreenivas, Saurav Muralidharan, Raviraj Joshi, Marcin Chochowski, M. Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov
21 Aug 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Daniel A. Roberts
26 Mar 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery, Steven Kolawole, Jean-Francois Kagey, Virginia Smith, Graham Neubig, Ameet Talwalkar
08 Feb 2024