OneBit: Towards Extremely Low-bit Large Language Models
17 February 2024
Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che
MQ
ArXiv (abs) · PDF · HTML · HuggingFace (25 upvotes) · GitHub (204★)

Papers citing "OneBit: Towards Extremely Low-bit Large Language Models"

28 / 28 papers shown
R2Q: Towards Robust 2-Bit Large Language Models via Residual Refinement Quantization
Jiayi Chen, Jieqi Shi, Jing Huo, Chen Wu
MQ · 21 Nov 2025

MACKO: Sparse Matrix-Vector Multiplication for Low Sparsity
Vladimír Macko, Vladimír Boža
17 Nov 2025

Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
Chenyu Wang, Zhanglu Yan, Zhi Zhou, Xu Chen, Weng-Fai Wong
MQ · 22 Oct 2025

PT²-LLM: Post-Training Ternarization for Large Language Models
Xianglong Yan, Chengzhu Bao, Zhiteng Li, Tianao Zhang, Kaicheng Yang, Haotong Qin, Ruobing Xie, Xingwu Sun, Yulun Zhang
MQ · 27 Sep 2025

SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
Junhao Xia, Ming Zhao, Limin Xiao, Xiujun Zhang
MQ · 27 Sep 2025

APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025
Shaobo Ma, Chao Fang, Haikuo Shao, Zhongfeng Wang
26 Aug 2025

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models
Zhijun Tu, Hanting Chen, Siqi Liu, Chuanjian Liu, Jian Li, Jie Hu, Yunhe Wang
MQ · 09 Aug 2025

SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Xiangchen Li, Dimitrios Spatharakis, Saeid Ghafouri, Jiakun Fan, Deepu John, Bo Ji, Dimitrios S. Nikolopoulos
11 Jun 2025

MiniCPM4: Ultra-Efficient LLMs on End Devices
MiniCPM Team: Chaojun Xiao, Yuxuan Li, Xu Han, Yuzhuo Bai, ..., Zhiyuan Liu, Guoyang Zeng, Chao Jia, Dahai Li, Maosong Sun
MLLM · 09 Jun 2025

MANBench: Is Your Multimodal Model Smarter than Human?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang
04 Jun 2025

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee, Dongkyu Kim, Youngcheon You, Youngmin Kim
MQ · 30 May 2025

Highly Efficient and Effective LLMs with Multi-Boolean Architectures
Ba-Hien Tran, Van Minh Nguyen
MQ · 28 May 2025

Addition is almost all you need: Compressing large language models with double binary factorization
Vladimír Boža, Vladimír Macko
MQ · 16 May 2025

Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
Modeling Decisions for Artificial Intelligence (MDAI), 2025
Deyu Cao, Samin Aref
MQ · 14 Apr 2025

Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai, Yuma Ichikawa
MQ · 13 Apr 2025

When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models
Nan Zhang, Eugene Kwek, Yusen Zhang, Ngoc-Hieu Nguyen, Prasenjit Mitra, Rui Zhang
MQ · LRM · 02 Apr 2025

Dynamic Low-Rank Sparse Adaptation for Large Language Models
International Conference on Learning Representations (ICLR), 2025
Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Wenshu Fan, Aiyue Chen, Yiwu Yao, Rongrong Ji
21 Feb 2025

Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis
Jiaqi Zhao, Ming Wang, Miao Zhang, Yuzhang Shang, Xuebo Liu, Yaowei Wang, Min Zhang, Liqiang Nie
MQ · 18 Feb 2025

Progressive Binarization with Semi-Structured Pruning for LLMs
Xinyu Yan, Tianao Zhang, Zhiteng Li, Yulun Zhang
MQ · 03 Feb 2025

Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
International Symposium on High-Performance Computer Architecture (HPCA), 2024
Chao Fang, Man Shi, Robin Geens, Arne Symons, Zhongfeng Wang, Marian Verhelst
24 Nov 2024

Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang, Liqun Ma, Haoyang Li, Mingjie Sun, Zhiqiang Shen
Mamba · 18 Nov 2024

Inverted Activations
Georgii Sergeevich Novikov, Ivan Oseledets
22 Jul 2024

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Shiyang Feng, Kaipeng Zhang, Ping Luo
MQ · 10 Jul 2024

FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma, Mingjie Sun, Zhiqiang Shen
09 Jul 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang
MQ · 25 Jun 2024

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
International Conference on Learning Representations (ICLR), 2024
Xin Wang, Yu Zheng, Zhongwei Wan, Mi Zhang
MQ · 12 Mar 2024

A Survey on Trustworthy Edge Intelligence: From Security and Reliability To Transparency and Sustainability
IEEE Communications Surveys and Tutorials (COMST), 2023
Xiaojie Wang, Beibei Wang, Yu Wu, Zhaolong Ning, Song Guo, Feng Yu
27 Oct 2023

A Survey on Model Compression for Large Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
15 Aug 2023