Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1612.01064
Cited By
Trained Ternary Quantization
4 December 2016
Chenzhuo Zhu
Song Han
Huizi Mao
W. Dally
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Trained Ternary Quantization"
50 / 509 papers shown
Title
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
Yingchun Wang
Jingcai Guo
Song Guo
Weizhan Zhang
MQ
29
20
0
09 Feb 2023
Learning Discretized Neural Networks under Ricci Flow
Jun Chen
Han Chen
Mengmeng Wang
Guang Dai
Ivor W. Tsang
Y. Liu
15
2
0
07 Feb 2023
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani
J. McKinstry
S. K. Esser
R. Appuswamy
D. Modha
MQ
20
4
0
30 Jan 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
A. M. Ribeiro-dos-Santos
João Dinis Ferreira
O. Mutlu
G. Falcão
MQ
13
1
0
15 Jan 2023
Holistic Network Virtualization and Pervasive Network Intelligence for 6G
Xuemin Shen
Shen
Jie Gao
Wen Wu
Mushu Li
Conghao Zhou
W. Zhuang
24
233
0
02 Jan 2023
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Dan Liu
X. Chen
Chen-li Ma
Xue Liu
MQ
22
3
0
24 Dec 2022
Hyperspherical Loss-Aware Ternary Quantization
Dan Liu
Xue Liu
MQ
19
0
0
24 Dec 2022
Masked Wavelet Representation for Compact Neural Radiance Fields
Daniel Rho
Byeonghyeon Lee
Seungtae Nam
J. Lee
J. Ko
Eunbyung Park
36
52
0
18 Dec 2022
Towards Hardware-Specific Automatic Compression of Neural Networks
Torben Krieger
Bernhard Klein
Holger Fröning
MQ
19
2
0
15 Dec 2022
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference
Hai Wu
Ruifei He
Hao Hao Tan
Xiaojuan Qi
Kaibin Huang
MQ
19
2
0
10 Dec 2022
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance
Haotong Qin
Xudong Ma
Yifu Ding
X. Li
Yang Zhang
Zejun Ma
Jiakai Wang
Jie Luo
Xianglong Liu
MQ
38
20
0
13 Nov 2022
AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
Louis Leconte
S. Schechtman
Eric Moulines
27
4
0
07 Nov 2022
Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks
Cuong Pham
Tuan Hoang
Thanh-Toan Do
FedML
MQ
21
14
0
27 Oct 2022
Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer
Yanjing Li
Sheng Xu
Baochang Zhang
Xianbin Cao
Penglei Gao
Guodong Guo
MQ
ViT
26
89
0
13 Oct 2022
Structural Pruning via Latency-Saliency Knapsack
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
37
47
0
13 Oct 2022
SeKron: A Decomposition Method Supporting Many Factorization Structures
Marawan Gamal Abdel Hameed
A. Mosleh
Marzieh S. Tahaei
V. Nia
21
1
0
12 Oct 2022
Seeking Interpretability and Explainability in Binary Activated Neural Networks
Benjamin Leblanc
Pascal Germain
FAtt
29
1
0
07 Sep 2022
DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization
Xinlin Li
Bangya Liu
Ruizhi Yang
Vanessa Courville
Chao Xing
V. Nia
MQ
32
2
0
20 Aug 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
29
625
0
15 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
18
11
0
11 Aug 2022
Model Blending for Text Classification
Ramit Pahwa
18
0
0
05 Aug 2022
PalQuant: Accelerating High-precision Networks on Low-precision Accelerators
Qinghao Hu
Gang Li
Qiman Wu
Jian Cheng
MQ
18
2
0
03 Aug 2022
CoNLoCNN: Exploiting Correlation and Non-Uniform Quantization for Energy-Efficient Low-precision Deep Convolutional Neural Networks
Muhammad Abdullah Hanif
G. M. Sarda
Alberto Marchisio
Guido Masera
Maurizio Martina
Muhammad Shafique
MQ
8
4
0
31 Jul 2022
Quantized Sparse Weight Decomposition for Neural Network Compression
Andrey Kuzmin
M. V. Baalen
Markus Nagel
Arash Behboodi
MQ
6
3
0
22 Jul 2022
Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with Inexact Prox
Abdurakhmon Sadiev
D. Kovalev
Peter Richtárik
17
20
0
08 Jul 2022
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Xiaofan Zhang
Yao Chen
Cong Hao
Sitao Huang
Yuhong Li
Deming Chen
27
1
0
06 Jun 2022
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
Y. Fu
Haichuan Yang
Jiayi Yuan
Meng Li
Cheng Wan
Raghuraman Krishnamoorthi
Vikas Chandra
Yingyan Lin
28
18
0
02 Jun 2022
Gator: Customizable Channel Pruning of Neural Networks with Gating
E. Passov
E. David
N. Netanyahu
AAML
34
0
0
30 May 2022
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
30
82
0
14 May 2022
Revisiting Random Channel Pruning for Neural Network Compression
Yawei Li
Kamil Adamczewski
Wen Li
Shuhang Gu
Radu Timofte
Luc Van Gool
21
81
0
11 May 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Yihan Wang
Muyang Li
Han Cai
Wei-Ming Chen
Song Han
3DH
16
71
0
03 May 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Yujun Lin
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
21
107
0
25 Apr 2022
HCFL: A High Compression Approach for Communication-Efficient Federated Learning in Very Large Scale IoT Networks
Minh-Duong Nguyen
Sangmin Lee
Viet Quoc Pham
D. Hoang
Diep N. Nguyen
W. Hwang
12
28
0
14 Apr 2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
Sharath Girish
Kamal Gupta
Saurabh Singh
Abhinav Shrivastava
28
11
0
06 Apr 2022
Soft Threshold Ternary Networks
Weixiang Xu
Xiangyu He
Tianli Zhao
Qinghao Hu
Peisong Wang
Jian Cheng
MQ
14
7
0
04 Apr 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
18
6
0
22 Mar 2022
Learning Compressed Embeddings for On-Device Inference
Niketan Pansare
J. Katukuri
Aditya Arora
F. Cipollone
R. Shaik
Noyan Tokgozoglu
Chandru Venkataraman
24
14
0
18 Mar 2022
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos
Georgios Zervakis
Dimitrios Soudris
J. Henkel
196
93
0
16 Mar 2022
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
16
13
0
08 Mar 2022
Distilled Neural Networks for Efficient Learning to Rank
F. M. Nardini
Cosimo Rulli
Salvatore Trani
Rossano Venturini
FedML
24
16
0
22 Feb 2022
Bit-wise Training of Neural Network Weights
Cristian Ivan
MQ
16
0
0
19 Feb 2022
Vau da muntanialas: Energy-efficient multi-die scalable acceleration of RNN inference
G. Paulin
Francesco Conti
Lukas Cavigelli
Luca Benini
22
8
0
14 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
24
48
0
10 Feb 2022
Lightweight Jet Reconstruction and Identification as an Object Detection Task
Adrian Alan Pol
T. Aarrestad
E. Govorkova
Roi Halily
Anat Klempner
...
Vladimir Loncar
J. Ngadiuba
M. Pierini
Olya Sirkin
S. Summers
17
2
0
09 Feb 2022
FAT: An In-Memory Accelerator with Fast Addition for Ternary Weight Neural Networks
Shien Zhu
Luan H. K. Duong
Hui Chen
Di Liu
Weichen Liu
MQ
14
5
0
19 Jan 2022
PocketNN: Integer-only Training and Inference of Neural Networks via Direct Feedback Alignment and Pocket Activations in Pure C++
Jae-Su Song
Fangzhen Lin
MQ
7
7
0
08 Jan 2022
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
Runpei Dong
Zhanhong Tan
Mengdi Wu
Linfeng Zhang
Kaisheng Ma
MQ
33
11
0
30 Dec 2021
Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques
JunKyu Lee
L. Mukhanov
A. S. Molahosseini
U. Minhas
Yang Hua
Jesus Martinez del Rincon
K. Dichev
Cheol-Ho Hong
Hans Vandierendonck
33
29
0
30 Dec 2021
BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
Souvik Kundu
Shikai Wang
Qirui Sun
P. Beerel
Massoud Pedram
MQ
13
18
0
24 Dec 2021
Elastic-Link for Binarized Neural Network
Jie Hu
Ziheng Wu
Vince Tan
Zhilin Lu
Mengze Zeng
Enhua Wu
MQ
26
6
0
19 Dec 2021
Previous
1
2
3
4
5
...
9
10
11
Next