ResearchTrend.AI
Model compression via distillation and quantization

15 February 2018
A. Polino
Razvan Pascanu
Dan Alistarh
    MQ

Papers citing "Model compression via distillation and quantization"

50 / 133 papers shown
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
54
0
0
05 May 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
165
0
0
20 Mar 2025
Towards Understanding Distilled Reasoning Models: A Representational Approach
David D. Baek
Max Tegmark
LRM
80
3
0
05 Mar 2025
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer
Milan Gritta
Gerasimos Lampouras
Haitham Bou Ammar
Jun Wang
76
4
0
04 Oct 2024
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
Md Ferdous Pervej
Minseok Choi
A. Molisch
33
0
0
12 Aug 2024
Training Foundation Models as Data Compression: On Information, Model Weights and Copyright Law
Giorgio Franceschelli
Claudia Cevenini
Mirco Musolesi
44
0
0
18 Jul 2024
Relational Representation Distillation
Nikolaos Giakoumoglou
Tania Stathaki
34
0
0
16 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
37
1
0
10 Jul 2024
LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
Hongxiang Zhang
Yuyang Rong
Yifeng He
Hao Chen
23
7
0
11 Jun 2024
Fast Vocabulary Transfer for Language Model Compression
Leonidas Gee
Andrea Zugarini
Leonardo Rigutini
Paolo Torroni
35
26
0
15 Feb 2024
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget
Kun Wang
Jiani Cao
Zimu Zhou
Zhenjiang Li
22
5
0
30 Jan 2024
Pursing the Sparse Limitation of Spiking Deep Learning Structures
Hao-Ran Cheng
Jiahang Cao
Erjia Xiao
Mengshu Sun
Le Yang
Jize Zhang
Xue Lin
B. Kailkhura
Kaidi Xu
Renjing Xu
16
1
0
18 Nov 2023
The Road to On-board Change Detection: A Lightweight Patch-Level Change Detection Network via Exploring the Potential of Pruning and Pooling
Lihui Xue
Zhihao Wang
Xueqian Wang
Gang Li
35
1
0
16 Oct 2023
Soft Quantization using Entropic Regularization
Rajmadan Lakshmanan
Alois Pichler
MQ
13
5
0
08 Sep 2023
eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models
Minsik Cho
Keivan Alizadeh Vahid
Qichen Fu
Saurabh N. Adya
C. C. D. Mundo
Mohammad Rastegari
Devang Naik
Peter Zatloukal
MQ
21
6
0
02 Sep 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
29
9
0
20 Jul 2023
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill
Sourav Dutta
VLM
MQ
34
1
0
12 Jul 2023
InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks
Yu-Zhu Guo
Y. Chen
Liwen Zhang
Xiaode Liu
Xinyi Tong
Yuanyuan Ou
Xuhui Huang
Zhe Ma
AAML
39
3
0
10 Jul 2023
A Review on Explainable Artificial Intelligence for Healthcare: Why, How, and When?
M. Rubaiyat Hossain Mondal
Prajoy Podder
20
56
0
10 Apr 2023
Performance-aware Approximation of Global Channel Pruning for Multitask CNNs
Hancheng Ye
Bo-Wen Zhang
Tao Chen
Jiayuan Fan
Bin Wang
29
18
0
21 Mar 2023
LightTS: Lightweight Time Series Classification with Adaptive Ensemble Distillation -- Extended Version
David Campos
Miao Zhang
B. Yang
Tung Kieu
Chenjuan Guo
Christian S. Jensen
AI4TS
45
47
0
24 Feb 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
A. M. Ribeiro-dos-Santos
João Dinis Ferreira
O. Mutlu
G. Falcão
MQ
15
1
0
15 Jan 2023
Pruning Compact ConvNets for Efficient Inference
Sayan Ghosh
Karthik Prasad
Xiaoliang Dai
Peizhao Zhang
Bichen Wu
Graham Cormode
Peter Vajda
VLM
19
4
0
11 Jan 2023
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN
VLM
MoE
26
7
0
06 Jan 2023
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu
Lin Niu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
96
68
0
14 Dec 2022
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification
Lirui Xiao
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
27
10
0
06 Dec 2022
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
24
6
0
05 Dec 2022
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
31
45
0
29 Nov 2022
AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
Louis Leconte
S. Schechtman
Eric Moulines
29
4
0
07 Nov 2022
Fast and Low-Memory Deep Neural Networks Using Binary Matrix Factorization
Alireza Bordbar
M. Kahaei
MQ
25
0
0
24 Oct 2022
Towards Global Neural Network Abstractions with Locally-Exact Reconstruction
Edoardo Manino
I. Bessa
Lucas C. Cordeiro
21
1
0
21 Oct 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
21
11
0
11 Aug 2022
Distributed Training for Deep Learning Models On An Edge Computing Network Using Shielded Reinforcement Learning
Tanmoy Sen
Haiying Shen
OffRL
11
5
0
01 Jun 2022
Dataset Distillation using Neural Feature Regression
Yongchao Zhou
E. Nezhadarya
Jimmy Ba
DD
FedML
39
149
0
01 Jun 2022
Target Aware Network Architecture Search and Compression for Efficient Knowledge Transfer
S. H. Shabbeer Basha
Debapriya Tula
Sravan Kumar Vinakota
S. Dubey
26
2
0
12 May 2022
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures
Yongji Wu
Matthew Lentz
Danyang Zhuo
Yao Lu
21
22
0
10 May 2022
Compact Model Training by Low-Rank Projection with Energy Transfer
K. Guo
Zhenquan Lin
Xiaofen Xing
Fang Liu
Xiangmin Xu
22
2
0
12 Apr 2022
EfficientFi: Towards Large-Scale Lightweight WiFi Sensing via CSI Compression
Jianfei Yang
Xinyan Chen
Han Zou
Dazhuo Wang
Q. Xu
Lihua Xie
14
78
0
08 Apr 2022
Bimodal Distributed Binarized Neural Networks
T. Rozen
Moshe Kimhi
Brian Chmiel
A. Mendelson
Chaim Baskin
MQ
36
4
0
05 Apr 2022
FedSynth: Gradient Compression via Synthetic Data in Federated Learning
Shengyuan Hu
Jack Goetz
Kshitiz Malik
Hongyuan Zhan
Zhe Liu
Yue Liu
DD
FedML
34
38
0
04 Apr 2022
Update Compression for Deep Neural Networks on the Edge
Bo Chen
A. Bakhshi
Gustavo E. A. P. A. Batista
Brian Ng
Tat-Jun Chin
24
17
0
09 Mar 2022
Deadwooding: Robust Global Pruning for Deep Neural Networks
Sawinder Kaur
Ferdinando Fioretto
Asif Salekin
19
4
0
10 Feb 2022
Robust Binary Models by Pruning Randomly-initialized Networks
Chen Liu
Ziqi Zhao
Sabine Süsstrunk
Mathieu Salzmann
TPM
AAML
MQ
19
4
0
03 Feb 2022
Iterative Activation-based Structured Pruning
Kaiqi Zhao
Animesh Jain
Ming Zhao
39
0
0
22 Jan 2022
Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer
Kaiqi Zhao
Yitao Chen
Ming Zhao
25
3
0
22 Jan 2022
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara
Luca Soldaini
Eric Lind
Alessandro Moschitti
21
6
0
15 Jan 2022
Problem-dependent attention and effort in neural networks with applications to image resolution and model selection
Chris Rohlfs
16
4
0
05 Jan 2022
Learning Robust and Lightweight Model through Separable Structured Transformations
Xian Wei
Yanhui Huang
Yang Xu
Mingsong Chen
Hai Lan
Yuanxiang Li
Zhongfeng Wang
Xuan Tang
OOD
24
0
0
27 Dec 2021
Illumination and Temperature-Aware Multispectral Networks for Edge-Computing-Enabled Pedestrian Detection
Yifan Zhuang
Ziyuan Pu
Jia Hu
Yinhai Wang
25
24
0
09 Dec 2021
Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
Thibault Castells
Seul-Ki Yeom
3DV
18
3
0
18 Nov 2021