ResearchTrend.AI
Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018
Raghuraman Krishnamoorthi
    MQ

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 464 papers shown
Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio
Tu Duyen Nguyen
Adrien Lesage
Clotilde Cantini
Rachid Riad
21
0
0
15 May 2025
Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs
K. Jeziorek
T. Kryjak
AI4TS
39
0
0
12 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
54
0
0
05 May 2025
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
Yeona Hong
Hyewon Han
Woo-Jin Chung
Hong-Goo Kang
MQ
28
0
0
21 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
30
0
0
12 Apr 2025
Efficient FPGA-accelerated Convolutional Neural Networks for Cloud Detection on CubeSats
Angela Cratere
M. Salim Farissi
Andrea Carbone
Marcello Asciolla
Maria Rizzi
Francesco Dell'Olio
Augusto Nascetti
Dario Spiller
28
1
0
04 Apr 2025
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu
Jiayi Zhang
Jiaxin Chen
Jinyang Guo
Di Huang
Yunhong Wang
MQ
47
1
0
03 Apr 2025
QSViT: A Methodology for Quantizing Spiking Vision Transformers
Rachmad Vidya Wicaksana Putra
Saad Iftikhar
Muhammad Shafique
MQ
44
0
0
01 Apr 2025
Real-Time Navigation for Autonomous Aerial Vehicles Using Video
Khizar Anjum
Parul Pandey
Vidyasagar Sadhu
Roberto Tron
D. Pompili
41
0
0
01 Apr 2025
A 71.2-$μ$W Speech Recognition Accelerator with Recurrent Spiking Neural Network
Chih-Chyau Yang
Tian-Sheuan Chang
60
1
0
27 Mar 2025
Efficient Personalization of Quantized Diffusion Model without Backpropagation
H. Seo
Wongi Jeong
Kyungryeol Lee
Se Young Chun
DiffM
MQ
78
0
0
19 Mar 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
54
0
0
19 Mar 2025
Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
David E. Hernandez
J. Chang
Torbjörn E. M. Nordling
58
0
0
17 Mar 2025
Real-Time Multi-Object Tracking using YOLOv8 and SORT on a SoC FPGA
Michal Danilowicz
T. Kryjak
VOT
58
0
0
17 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
60
0
0
12 Mar 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence
Shangwen Zhu
Han Zhang
Zhantao Yang
Qianyu Peng
Zhao Pu
H. Wang
Fan Cheng
DiffM
48
0
0
12 Mar 2025
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
Wenlun Zhang
Shimpei Ando
Kentaro Yoshioka
VLM
MQ
67
0
0
05 Mar 2025
Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants
Sourav Modak
Ahmet Oğuz Saltık
Anthony Stein
MQ
48
0
0
04 Mar 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
127
84
0
21 Feb 2025
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
Zechun Liu
Changsheng Zhao
Hanxian Huang
Sijia Chen
Jing Zhang
...
Yuandong Tian
Bilge Soran
Raghuraman Krishnamoorthi
Tijmen Blankevoort
Vikas Chandra
MQ
73
3
0
04 Feb 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
29
0
0
28 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
77
2
0
20 Dec 2024
PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
Zining Wang
J. Guo
Ruihao Gong
Yang Yong
Aishan Liu
Yushi Huang
Jiaheng Liu
X. Liu
71
1
0
10 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
59
0
0
03 Dec 2024
Behavior Backdoor for Deep Learning Models
J. T. Wang
Pengfei Zhang
R. Tao
Jian Yang
Hao Liu
X. Liu
Y. X. Wei
Yao Zhao
AAML
75
0
0
02 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
93
1
0
02 Dec 2024
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang
Tao Ge
Thomas Hartvigsen
Zhisong Zhang
Haitao Mi
Dong Yu
MQ
90
3
0
26 Nov 2024
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance
Jiayi Chen
Chen Wu
S. Zhang
Nan Li
L. Zhang
Qi Zhang
69
0
0
23 Nov 2024
Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations
Igor Fedorov
Kate Plawiak
Lemeng Wu
Tarek Elgamal
Naveen Suda
...
Bilge Soran
Zacharie Delpierre Coudert
Rachad Alao
Raghuraman Krishnamoorthi
Vikas Chandra
72
4
0
18 Nov 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
148
0
0
29 Oct 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
27
2
0
16 Oct 2024
PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization
Tianshi Xu
Shuzhang Zhong
Wenxuan Zeng
Runsheng Wang
Meng Li
MQ
29
0
0
12 Oct 2024
Continuous Approximations for Improving Quantization Aware Training of LLMs
He Li
Jianhang Hong
Yuanzhuo Wu
Snehal Adbol
Zonglin Li
MQ
21
1
0
06 Oct 2024
Constraint Guided Model Quantization of Neural Networks
Quinten Van Baelen
P. Karsmakers
MQ
26
0
0
30 Sep 2024
Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model
Frederic Adjewa
Moez Esseghir
Leila Merghem-Boulahia
35
5
0
28 Sep 2024
SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms
Niraj Pudasaini
Muhammad Abdullah Hanif
Muhammad Shafique
26
0
0
22 Sep 2024
Thinking in Granularity: Dynamic Quantization for Image Super-Resolution by Intriguing Multi-Granularity Clues
Mingshen Wang
Zhao Zhang
Feng Li
Ke Xu
Kang Miao
Meng Wang
MQ
SupR
38
1
0
22 Sep 2024
Sampling Latent Material-Property Information From LLM-Derived Embedding Representations
Luke P J Gilligan
M. Cobelli
Hasan M. Sayeed
Taylor D. Sparks
Stefano Sanvito
19
0
0
18 Sep 2024
Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare
Zhikai Li
Jing Zhang
Qingyi Gu
MedIm
36
1
0
14 Sep 2024
Computer Vision Model Compression Techniques for Embedded Systems: A Survey
Alexandre Lopes
Fernando Pereira dos Santos
D. Oliveira
Mauricio Schiezaro
Hélio Pedrini
28
5
0
15 Aug 2024
DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers
Lianwei Yang
Haisong Gong
Qingyi Gu
MQ
32
3
0
06 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
31
0
0
01 Aug 2024
Pixel Embedding: Fully Quantized Convolutional Neural Network with Differentiable Lookup Table
Hiroyuki Tokunaga
Joel Nicholls
Daria Vazhenina
Atsunori Kanemura
MQ
16
1
0
23 Jul 2024
Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models
Aayush Saxena
Arit Kumar Bishwas
Ayush Ashok Mishra
Ryan Armstrong
19
1
0
22 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
45
0
0
22 Jul 2024
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu
Jiaxin Chen
Hanwen Zhong
Di Huang
Yun Wang
MQ
40
9
0
17 Jul 2024
NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo
Luca Colombo
Manuel Roveri
MQ
29
0
0
16 Jul 2024
On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers
M. Deutel
Frank Hannig
Christopher Mutschler
Jürgen Teich
MQ
25
0
0
15 Jul 2024
LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid
Tianyi Zhang
Anshumali Shrivastava
MQ
31
4
0
14 Jul 2024