

Post-Training Quantization for Vision Transformer (arXiv:2106.14156)

27 June 2021
Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
ViT · MQ

Papers citing "Post-Training Quantization for Vision Transformer" (50 of 192 shown)
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Shih-yang Liu, Zechun Liu, Xijie Huang, Pingcheng Dong, Kwang-Ting Cheng
MQ · 25 Oct 2023

USDC: Unified Static and Dynamic Compression for Visual Transformer
Huan Yuan, Chao Liao, Jianchao Tan, Peng Yao, Jiyuan Jia, Bin Chen, Chengru Song, Di Zhang
ViT · 17 Oct 2023

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
Hang Shao, Bei Liu, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian
14 Oct 2023

QuATON: Quantization Aware Training of Optical Neurons
Hasindu Kariyawasam, Ramith Hettiarachchi, Quansan Yang, Alex Matlock, Takahiro Nambara, Hiroyuki Kusaka, Yuichiro Kunai, Peter T C So, Edward S Boyden, D. Wadduwage
MQ · 04 Oct 2023

Compressing LLMs: The Truth is Rarely Pure and Never Simple
Ajay Jaiswal, Zhe Gan, Xianzhi Du, Bowen Zhang, Zhangyang Wang, Yinfei Yang
MQ · 02 Oct 2023

YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
Cyrus Zhou, Zack Hassman, Ruize Xu, Dhirpal Shah, Vaughn Richard, Yanjing Li
01 Oct 2023
A Precision-Scalable RISC-V DNN Processor with On-Device Learning Capability at the Extreme Edge
Longwei Huang, Chao Fang, Qiong Li, Jun Lin, Zhongfeng Wang
15 Sep 2023

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Wenhua Cheng, Weiwei Zhang, Haihao Shen, Yiyang Cai, Xin He, Kaokao Lv, Yi. Liu
MQ · 11 Sep 2023

Compressing Vision Transformers for Low-Resource Visual Learning
Eric Youn, J. SaiMitheran, Sanjana Prabhu, Siyuan Chen
ViT · 05 Sep 2023

A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking
Lorenzo Papa, Paolo Russo, Irene Amerini, Luping Zhou
05 Sep 2023

Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin, Dibakar Gope, Diana Marculescu
MQ · 21 Aug 2023

QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
MQ · 21 Aug 2023

A Survey on Model Compression for Large Language Models
Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
15 Aug 2023

Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, J. Wang, Wei Zhang
ViT · 03 Aug 2023

Survey on Computer Vision Techniques for Internet-of-Things Devices
Ishmeet Kaur, Adwaita Janardhan Jadhav
AI4CE · 02 Aug 2023
Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy
Shibo Jie, Haoqing Wang, Zhiwei Deng
31 Jul 2023

QuIP: 2-Bit Quantization of Large Language Models With Guarantees
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Chris De Sa
MQ · 25 Jul 2023

A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization
Edward Fish, Umberto Michieli, Mete Ozay
MQ · 24 Jul 2023

Digital Modeling on Large Kernel Metamaterial Neural Network
Quan Liu, Hanyu Zheng, Brandon T. Swartz, Ho Hin Lee, Zuhayr Asad, I. Kravchenko, Jason G Valentine, Yuankai Huo
21 Jul 2023

Learned Thresholds Token Merging and Pruning for Vision Transformers
Maxim Bonnaerens, J. Dambre
20 Jul 2023

Quantized Feature Distillation for Network Quantization
Kevin Zhu, Yin He, Jianxin Wu
MQ · 20 Jul 2023

A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata, Sparsh Mittal, M. Emani, V. Vishwanath, Arun Somani
16 Jul 2023

Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu, Yichen Zhu
ViT · 05 Jul 2023

Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Jun Chen, Shipeng Bai, Tianxin Huang, Mengmeng Wang, Guanzhong Tian, Y. Liu
MQ · 02 Jul 2023

BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models
Phuoc-Hoan Charles Le, Xinlin Li
ViT · MQ · 29 Jun 2023

Dynamic Perceiver for Efficient Visual Recognition
Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chaorui Deng, Junlan Feng, S. Song, Gao Huang
20 Jun 2023
S^3: Increasing GPU Utilization during Generative Inference for Higher Throughput
Yunho Jin, Chun-Feng Wu, David Brooks, Gu-Yeon Wei
09 Jun 2023

PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
Zhuocheng Gong, Jiahao Liu, Qifan Wang, Yang Yang, Jingang Wang, Wei Yu Wu, Yunsen Xian, Dongyan Zhao, Rui Yan
MQ · 30 May 2023

Towards Accurate Post-training Quantization for Diffusion Models
Changyuan Wang, Ziwei Wang, Xiuwei Xu, Yansong Tang, Jie Zhou, Jiwen Lu
MQ · 30 May 2023

DiffRate: Differentiable Compression Rate for Efficient Vision Transformers
Mengzhao Chen, Wenqi Shao, Peng Xu, Mingbao Lin, Kaipeng Zhang, Fei Chao, Rongrong Ji, Yu Qiao, Ping Luo
ViT · 29 May 2023

Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan
AI4CE · 24 May 2023

BinaryViT: Towards Efficient and Accurate Binary Vision Transformers
Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
MQ · ViT · 24 May 2023

Bi-ViT: Pushing the Limit of Vision Transformer Quantization
Yanjing Li, Sheng Xu, Mingbao Lin, Xianbin Cao, Chuanjian Liu, Xiao Sun, Baochang Zhang
ViT · MQ · 21 May 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang, Wenjie Ruan, Wei Huang, Gao Jin, Yizhen Dong, ..., Sihao Wu, Peipei Xu, Dengyu Wu, André Freitas, Mustafa A. Mustafa
ALM · 19 May 2023

Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR
Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Y. Qian
18 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan
MQ · ViT · 18 May 2023

Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
MQ · 11 May 2023

Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong, Zheng Liu, Xiangshan Chen
ViT · 21 Apr 2023

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer
Sheng Xu, Yanjing Li, Mingbao Lin, Penglei Gao, Guodong Guo, Jinhu Lu, Baochang Zhang
MQ · 01 Apr 2023

Towards Accurate Post-Training Quantization for Vision Transformer
Yifu Ding, Haotong Qin, Qing-Yu Yan, Z. Chai, Junjie Liu, Xiaolin K. Wei, Xianglong Liu
MQ · 25 Mar 2023

Scaled Quantization for the Vision Transformer
Yangyang Chang, G. E. Sobelman
MQ · 23 Mar 2023

Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
Jemin Lee, Yongin Kwon, Sihyeong Park, Misun Yu, Jeman Park, Hwanjun Song
ViT · MQ · 22 Mar 2023

Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers
Minsoo Kim, Kyuhong Shim, Seongmin Park, Wonyong Sung, Jungwook Choi
MQ · 23 Feb 2023

Optical Transformers
Maxwell G. Anderson, Shifan Ma, Tianyu Wang, Logan G. Wright, Peter L. McMahon
20 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li, M. Wang, Sijia Liu, Pin-Yu Chen
ViT · MLT · 12 Feb 2023

Oscillation-free Quantization for Low-bit Vision Transformers
Shi Liu, Zechun Liu, Kwang-Ting Cheng
MQ · 04 Feb 2023

Knowledge Distillation in Vision Transformers: A Critical Review
Gousia Habib, Tausifa Jan Saleem, Brejesh Lall
04 Feb 2023

Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
Xiaoxia Wu, Cheng-rong Li, Reza Yazdani Aminabadi, Z. Yao, Yuxiong He
MQ · 27 Jan 2023

RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu
MQ · 16 Dec 2022

Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren
ViT · 15 Dec 2022