ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.15701
  4. Cited By
BinaryBERT: Pushing the Limit of BERT Quantization

BinaryBERT: Pushing the Limit of BERT Quantization

31 December 2020
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
    MQ
ArXivPDFHTML

Papers citing "BinaryBERT: Pushing the Limit of BERT Quantization"

17 / 17 papers shown
Title
COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference
COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference
Ye Qiao
Zhiheng Cheng
Yian Wang
Yifan Zhang
Yunzhe Deng
Sitao Huang
70
0
0
22 Apr 2025
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou
Qizheng Zhang
Hermann Kumbong
Kunle Olukotun
MQ
115
0
0
12 Feb 2025
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai
M. Hanawal
63
0
0
02 Feb 2025
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Armand Foucault
Franck Mamalet
François Malgouyres
MQ
64
0
0
28 Jan 2025
FlatQuant: Flatness Matters for LLM Quantization
FlatQuant: Flatness Matters for LLM Quantization
Yuxuan Sun
Ruikang Liu
Haoli Bai
Han Bao
Kang Zhao
...
Lu Hou
Chun Yuan
Xin Jiang
W. Liu
Jun Yao
MQ
38
3
0
12 Oct 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
65
6
0
19 Aug 2024
Accelerating Large Language Model Inference with Self-Supervised Early
  Exits
Accelerating Large Language Model Inference with Self-Supervised Early Exits
Florian Valade
LRM
26
1
0
30 Jul 2024
BOLD: Boolean Logic Deep Learning
BOLD: Boolean Logic Deep Learning
Van Minh Nguyen
Cristian Ocampo
Aymen Askri
Louis Leconte
Ba-Hien Tran
AI4CE
16
0
0
25 May 2024
Jumping through Local Minima: Quantization in the Loss Landscape of
  Vision Transformers
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
16
15
0
21 Aug 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
20
186
0
29 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and
  Multi-label text Classification Tasks
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
11
12
0
21 May 2023
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision
  Transformers
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
ViT
MQ
13
1
0
17 Nov 2022
Compression of Generative Pre-trained Language Models via Quantization
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
11
103
0
21 Mar 2022
BERMo: What can BERT learn from ELMo?
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
15
3
0
18 Oct 2021
Towards Efficient Post-training Quantization of Pre-trained Language
  Models
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
39
47
0
30 Sep 2021
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
214
505
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
1