ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Post-Training Quantization for Vision Transformer (arXiv:2106.14156)

27 June 2021
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
    ViT
    MQ

Papers citing "Post-Training Quantization for Vision Transformer"

50 / 192 papers shown
ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers
Yanfeng Jiang
Ning Sun
Xueshuo Xie
Fei Yang
Tao Li
MQ
31
2
0
03 Jul 2024
Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge
Nick Eliopoulos
Purvish Jajal
James Davis
Gaowen Liu
George K. Thiravathukal
Yung-Hsiang Lu
36
1
0
01 Jul 2024
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang
Piji Li
KELM
CLL
37
7
0
28 Jun 2024
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model
Habib Hajimolahoseini
Mohammad Hassanpour
Foozhan Ataiefard
Boxing Chen
Yang Liu
23
2
0
28 Jun 2024
PEANO-ViT: Power-Efficient Approximations of Non-Linearities in Vision Transformers
Mohammad Erfan Sadeghi
A. Fayyazi
Seyedarmin Azizi
Massoud Pedram
27
8
0
21 Jun 2024
MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction
Lianwei Yang
Zhikai Li
Junrui Xiao
Haisong Gong
Qingyi Gu
MQ
25
3
0
13 Jun 2024
Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review
Sonia Bbouzidi
Ghazala Hcini
Imen Jdey
Fadoua Drira
16
4
0
05 Jun 2024
SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud
Shijin Duan
Chenghong Wang
Hongwu Peng
Yukui Luo
Wujie Wen
Caiwen Ding
Xiaolin Xu
28
5
0
04 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Yu Wang
MQ
VGen
97
22
0
04 Jun 2024
P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
Huihong Shi
Xin Cheng
Wendong Mao
Zhongfeng Wang
MQ
40
3
0
30 May 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
33
0
0
24 May 2024
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Peng Wang
Zexi Li
Ningyu Zhang
Ziwen Xu
Yunzhi Yao
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
KELM
CLL
45
20
0
23 May 2024
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation
Mykhail M. Uss
Ruslan Yermolenko
Olena Kolodiazhna
Oleksii Shashko
Ivan Safonov
Volodymyr Savin
Yoonjae Yeo
Seowon Ji
Jaeyun Jeong
MQ
25
0
0
22 May 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
39
45
0
17 May 2024
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
Yunqian Fan
Xiuying Wei
Ruihao Gong
Yuqing Ma
Xiangguo Zhang
Qi Zhang
Xianglong Liu
MQ
24
2
0
10 May 2024
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Xue Geng
Zhe Wang
Chunyun Chen
Qing Xu
Kaixin Xu
...
Zhenghua Chen
M. Aly
Jie Lin
Min-man Wu
Xiaoli Li
31
1
0
09 May 2024
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
Huihong Shi
Haikuo Shao
Wendong Mao
Zhongfeng Wang
ViT
MQ
36
3
0
06 May 2024
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
Dayou Du
Gu Gong
Xiaowen Chu
MQ
32
7
0
01 May 2024
SNP: Structured Neuron-level Pruning to Preserve Attention Scores
Kyunghwan Shim
Jaewoong Yun
Shinkook Choi
25
0
0
18 Apr 2024
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Feiyang Chen
Ziqian Luo
Lisang Zhou
Xueting Pan
Ying Jiang
14
22
0
16 Apr 2024
Test-Time Model Adaptation with Only Forward Passes
Shuaicheng Niu
Chunyan Miao
Guohao Chen
Pengcheng Wu
Peilin Zhao
TTA
34
18
0
02 Apr 2024
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon
Dohyung Kim
Junyong Cheon
Bumsub Ham
MQ
ViT
27
5
0
01 Apr 2024
QNCD: Quantization Noise Correction for Diffusion Models
Huanpeng Chu
Wei Wu
Chengjie Zang
Kun Yuan
DiffM
MQ
29
4
0
28 Mar 2024
DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics
Yoonsung Kim
Changhun Oh
Jinwoo Hwang
Wonung Kim
Seongryong Oh
Yubin Lee
Hardik Sharma
Amir Yazdanbakhsh
Jongse Park
33
7
0
21 Mar 2024
AffineQuant: Affine Transformation Quantization for Large Language Models
Yuexiao Ma
Huixia Li
Xiawu Zheng
Feng Ling
Xuefeng Xiao
Rui Wang
Shilei Wen
Fei Chao
Rongrong Ji
MQ
38
17
0
19 Mar 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi
Yuxiang Chen
Kang Zhao
Kaijun Zheng
Jianfei Chen
Jun Zhu
MQ
37
19
0
19 Mar 2024
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Yizhe Xiong
Hui Chen
Tianxiang Hao
Zijia Lin
Jungong Han
Yuesong Zhang
Guoxin Wang
Yongjun Bao
Guiguang Ding
43
16
0
14 Mar 2024
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization
Aozhong Zhang
Zi Yang
Naigang Wang
Yingyong Qin
Jack Xin
Xin Li
Penghang Yin
VLM
MQ
25
3
0
11 Mar 2024
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu
Zhanpeng Zeng
Li Zhang
Vikas Singh
MQ
32
5
0
10 Mar 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
25
103
0
06 Mar 2024
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
Yiwei Lu
Yaoliang Yu
Xinlin Li
Vahid Partovi Nia
MQ
30
3
0
27 Feb 2024
Tiny Reinforcement Learning for Quadruped Locomotion using Decision Transformers
Orhan Eren Akgün
Néstor Cuevas
Matheus Farias
Daniel Garces
28
0
0
20 Feb 2024
TransAxx: Efficient Transformers with Approximate Computing
Dimitrios Danopoulos
Georgios Zervakis
Dimitrios Soudris
Jörg Henkel
ViT
42
2
0
12 Feb 2024
Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
Seyedarmin Azizi
M. Nazemi
Massoud Pedram
ViT
MQ
38
2
0
08 Feb 2024
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
Zhikai Li
Xuewen Liu
Jing Zhang
Qingyi Gu
MQ
32
7
0
08 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
29
27
0
05 Feb 2024
Faster Inference of Integer SWIN Transformer by Removing the GELU Activation
Mohammadreza Tayaranian
S. H. Mozafari
James J. Clark
Brett H. Meyer
Warren Gross
26
1
0
02 Feb 2024
Lightweight Pixel Difference Networks for Efficient Visual Representation Learning
Z. Su
Jiehua Zhang
Longguang Wang
Hua Zhang
Zhen Liu
M. Pietikäinen
Li Liu
30
21
0
01 Feb 2024
ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters
Shiwei Liu
Guanchen Tao
Yifei Zou
Derek Chow
Zichen Fan
Kauna Lei
Bangfei Pan
Dennis Sylvester
Gregory Kielian
Mehdi Saligane
21
7
0
31 Jan 2024
MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer
Y. Tai
An-Yeu Wu
MQ
21
6
0
26 Jan 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
49
5
0
22 Jan 2024
LRP-QViT: Mixed-Precision Vision Transformer Quantization via Layer-wise Relevance Propagation
Navin Ranjan
Andreas E. Savakis
MQ
19
6
0
20 Jan 2024
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
Han Shu
Wenshuo Li
Yehui Tang
Yiman Zhang
Yihao Chen
Houqiang Li
Yunhe Wang
Xinghao Chen
VLM
36
18
0
21 Dec 2023
SimQ-NAS: Simultaneous Quantization Policy and Neural Architecture Search
S. N. Sridhar
Maciej Szankin
Fang Chen
Sairam Sundaresan
Anthony Sarah
MQ
19
0
0
19 Dec 2023
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Xiaoxia Wu
Haojun Xia
Stephen Youn
Zhen Zheng
Shiyang Chen
...
Reza Yazdani Aminabadi
Yuxiong He
Olatunji Ruwase
Leon Song
Zhewei Yao
66
8
0
14 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
27
11
0
13 Dec 2023
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
Yuhang Li
Youngeun Kim
Donghyun Lee
Souvik Kundu
Priyadarshini Panda
MQ
20
2
0
07 Dec 2023
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
Yunshan Zhong
Jiawei Hu
Mingbao Lin
Mengzhao Chen
Rongrong Ji
MQ
28
3
0
16 Nov 2023
FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
Chi-Chih Chang
Yuan-Yao Sung
Shixing Yu
N. Huang
Diana Marculescu
Kai-Chiang Wu
ViT
13
1
0
07 Nov 2023
MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory
Yinan Liang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
18
9
0
25 Oct 2023