ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Model Quantization

MQ

Model Quantization is a technique used to reduce the size and computational requirements of machine learning models by representing weights and activations with lower precision. This is particularly useful for deploying models on resource-constrained devices, such as mobile phones and embedded systems.
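The idea above can be made concrete with a short sketch. Below is a minimal example of symmetric per-tensor int8 post-training quantization in NumPy; the function names and the random weight matrix are illustrative assumptions, not taken from any paper listed on this page:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: map floats onto the int8 range [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float32 approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # stand-in for a layer's weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, and the rounding error
# per weight is bounded by half a quantization step (scale / 2).
print(float(np.max(np.abs(w - w_hat))))
```

Real deployments typically use per-channel scales and also quantize activations with calibration data, which is where most of the methods listed below differ.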



All papers

Showing 50 of 2,955 papers.
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
  Wenhua Cheng, Weiwei Zhang, Heng Guo, Haihao Shen — MQ — 12 · 0 · 0 — 04 Dec 2025
BEP: A Binary Error Propagation Algorithm for Binary Neural Networks Training
  Luca Colombo, Fabrizio Pittorino, Daniele Zambon, Carlo Baldassi, Manuel Roveri, Cesare Alippi — MQ, AAML — 16 · 0 · 0 — 03 Dec 2025
ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers
  Feice Huang, Zuliang Han, Xing Zhou, Yihuang Chen, Lifei Zhu, Haoqian Wang — MQ — 8 · 0 · 0 — 03 Dec 2025
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
  Hung-Yueh Chiang, Chi-Chih Chang, Yu-Chen Lu, Chien-Yu Lin, Kai-Chiang Wu, Mohamed S. Abdelfattah, Diana Marculescu — MQ — 28 · 0 · 0 — 03 Dec 2025
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i}
  Feiyu Wang, Xinyu Tan, Bokai Huang, Yihao Zhang, Guoan Wang, Peizhuang Cong, Tong Yang — MQ — 172 · 0 · 0 — 02 Dec 2025
Q-BERT4Rec: Quantized Semantic-ID Representation Learning for Multimodal Recommendation
  Haofeng Huang, Ling Gai — MQ, VLM — 132 · 0 · 0 — 02 Dec 2025
Intrinsic Structure as a Proxy for Saliency: SVD-Based Weight Preservation for Mixed-Precision Quantization in Large Language Models
  Shashank Landge, Abhishek Patil, Tejas Kamble, Bhushan Buddhivant, Priyanka Joshi — MQ — 20 · 0 · 0 — 01 Dec 2025
KV Pareto: Systems-Level Optimization of KV Cache and Model Compression for Long Context Inference
  Sai Gokhale, Devleena Das, Rajeev Patwari, Ashish Sirasao, Elliott Delaye — MQ — 36 · 0 · 0 — 01 Dec 2025
LPCD: Unified Framework from Layer-Wise to Submodule Quantization
  Yuma Ichikawa, Yudai Fujimoto, Akira Sakai — MQ — 12 · 0 · 0 — 01 Dec 2025
Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling
  Jack Cook, Junxian Guo, Guangxuan Xiao, Yujun Lin, Song Han — MQ — 140 · 0 · 0 — 01 Dec 2025
HBLLM: A Haar-Based Approach for Accurate Structured 1-Bit Quantized LLMs
  Ningning Chen, Weicai Ye, Ying Jiang — MQ — 100 · 0 · 0 — 30 Nov 2025
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization
  Jiale Chen, Vage Egiazarian, Torsten Hoefler, Dan Alistarh — MQ — 12 · 0 · 0 — 30 Nov 2025
Realistic Handwritten Multi-Digit Writer (MDW) Number Recognition Challenges
  Kiri L. Wagstaff — MQ — 12 · 0 · 0 — 30 Nov 2025
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies
  Guang Liang, Jie Shao, Ningyuan Tang, Xinyao Liu, Jianxin Wu — MQ — 40 · 0 · 0 — 28 Nov 2025
Enhancing Trustworthiness with Mixed Precision: Benchmarks, Opportunities, and Challenges
  Guanxi Lu, Hao Mark Chen, Zhiqiang Que, Wayne Luk, Hongxiang Fan — MQ — 12 · 0 · 0 — 27 Nov 2025
SingleQuant: Efficient Quantization of Large Language Models in a Single Pass
  Jinying Xiao, Bin Ji, Shasha Li, Xiaodong Liu, Ma Jun, Ye Zhong, Wei Li, Xuan Xie, Qingbo Wu, Jie Yu — MQ — 20 · 0 · 0 — 27 Nov 2025
IntAttention: A Fully Integer Attention Pipeline for Efficient Edge Inference
  Wanli Zhong, Haibo Feng, Zirui Zhou, Hanyang Peng, Shiqi Yu — MQ — 74 · 0 · 0 — 26 Nov 2025
G-Net: A Provably Easy Construction of High-Accuracy Random Binary Neural Networks
  Alireza Aghasi, Nicholas F. Marshall, Saeid Pourmand, Wyatt D. Whiting — MQ — 164 · 0 · 0 — 26 Nov 2025
QuantKAN: A Unified Quantization Framework for Kolmogorov Arnold Networks
  Kazi Ahmed Asif Fuad, Lizhong Chen — MQ — 73 · 0 · 0 — 24 Nov 2025
CafeQ: Calibration-free Quantization via Learned Transformations and Adaptive Rounding
  Ziteng Sun, Adrian Benton, Samuel Kushnir, Asher Trockman, Vikas Singh, Suhas Diggavi, A. Suresh — MQ — 29 · 0 · 0 — 24 Nov 2025
Adaptive Mesh-Quantization for Neural PDE Solvers
  Winfried van den Dool, Maksim Zhdanov, Yuki M. Asano, Max Welling — MQ, AI4CE — 253 · 0 · 0 — 23 Nov 2025
RFX: High-Performance Random Forests with GPU Acceleration and QLORA Compression
  Chris Kuchar — MQ — 128 · 0 · 0 — 23 Nov 2025
Kitty: Accurate and Efficient 2-bit KV Cache Quantization with Dynamic Channel-wise Precision Boost
  Haojun Xia, Xiaoxia Wu, Jisen Li, Robert Wu, Junxiong Wang, ..., Donglin Zhuang, Zhongzhu Zhou, Ben Athiwaratkun, Zhen Zheng, Shuaiwen Leon Song — MQ — 40 · 0 · 0 — 23 Nov 2025
A Systematic Study of Compression Ordering for Large Language Models
  Shivansh Chhawri, Rahul Mahadik, Suparna Rooj — MQ — 36 · 0 · 0 — 23 Nov 2025
Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
  Cuong Pham, Hoang Anh Dung, Cuong C. Nguyen, Trung Le, G. Carneiro, Jianfei Cai, Thanh-Toan Do — MQ — 38 · 0 · 0 — 21 Nov 2025
Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models
  Cuong Pham, Hoang Anh Dung, Cuong C. Nguyen, Trung Le, G. Carneiro, Thanh-Toan Do — MQ — 33 · 0 · 0 — 21 Nov 2025
R2Q: Towards Robust 2-Bit Large Language Models via Residual Refinement Quantization
  Jiayi Chen, Jieqi Shi, Jing Huo, Chen Wu — MQ — 58 · 0 · 0 — 21 Nov 2025
A Multi-Stage Optimization Framework for Deploying Learned Image Compression on FPGAs
  Jiaxun Fang, Li Chen — MQ — 144 · 0 · 0 — 21 Nov 2025
Layer-wise Weight Selection for Power-Efficient Neural Network Acceleration
  Jiaxun Fang, Grace Li Zhang, Shaoyi Huang — MQ — 179 · 0 · 0 — 21 Nov 2025
PocketLLM: Ultimate Compression of Large Language Models via Meta Networks
  Ye Tian, Chengcheng Wang, Jing Han, Yehui Tang, Kai Han — MQ — 32 · 0 · 0 — 19 Nov 2025
D4C: Data-free Quantization for Contrastive Language-Image Pre-training Models
  Wenlun Zhang, Yunshan Zhong, Zihao Ding, Xinyu Li, Kentaro Yoshioka — MQ, CLIP, VLM — 83 · 0 · 0 — 19 Nov 2025
Quant-Trim in Practice: Improved Cross-Platform Low-Bit Deployment on Edge NPUs
  Rayen Dhahri, Steffen Urban — MQ — 106 · 0 · 0 — 19 Nov 2025
Multi-Aspect Cross-modal Quantization for Generative Recommendation
  Fuwei Zhang, Xiaoyu Liu, Dongbo Xi, Jishen Yin, Huan Chen, Peng Yan, Fuzhen Zhuang, Zhao Zhang — MQ — 149 · 0 · 0 — 19 Nov 2025
IPTQ-ViT: Post-Training Quantization of Non-linear Functions for Integer-only Vision Transformers
  Gihwan Kim, Jemin Lee, Hyungshin Kim — MQ — 44 · 0 · 0 — 19 Nov 2025
The Impact of Quantization on Large Reasoning Model Reinforcement Learning
  M S Chaitanya Kumar, Zifei Xu, Xin Wang, T. Webb — OffRL, MQ, LRM — 229 · 0 · 0 — 19 Nov 2025
BD-Net: Has Depth-Wise Convolution Ever Been Applied in Binary Neural Networks?
  DoYoung Kim, Jin-Seop Lee, Noo-Ri Kim, SungJoon Lee, Jee-Hyong Lee — MQ — 56 · 3 · 0 — 19 Nov 2025
Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
  Kexin Chu, Dawei Xiang, Zixu Shen, Yiwei Yang, Zecheng Liu, Wei Zhang — MoE, MQ — 223 · 0 · 0 — 19 Nov 2025
TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs
  Yuxiang Zhang, Zhengxu Yu, Weihang Pan, Zhongming Jin, Qiang Fu, Deng Cai, Binbin Lin, Jieping Ye — OffRL, MQ, LRM — 140 · 0 · 0 — 17 Nov 2025
SLMQuant: Benchmarking Small Language Model Quantization for Practical Deployment
  Jiacheng Wang, Yejun Zeng, Jinyang Guo, Yuqing Ma, Aishan Liu, Xianglong Liu — MQ — 169 · 1 · 0 — 17 Nov 2025
MCAQ-YOLO: Morphological Complexity-Aware Quantization for Efficient Object Detection with Curriculum Learning
  Yoonjae Seo, Ermal Elbasani, Jaehong Lee — MQ — 76 · 0 · 0 — 17 Nov 2025
OTARo: Once Tuning for All Precisions toward Robust On-Device LLMs
  Shaoyuan Chen, Zhixuan Chen, Dawei Yang, Zhihang Yuan, Qiang Wu — MQ — 128 · 0 · 0 — 17 Nov 2025
Dimension vs. Precision: A Comparative Analysis of Autoencoders and Quantization for Efficient Vector Retrieval on BEIR SciFact
  Satyanarayan Pati — MQ — 83 · 0 · 0 — 17 Nov 2025
Diffusion Model Based Signal Recovery Under 1-Bit Quantization
  Youming Chen, Zhaoqiang Liu — DiffM, MQ — 154 · 0 · 0 — 16 Nov 2025
FERMI-ML: A Flexible and Resource-Efficient Memory-In-Situ SRAM Macro for TinyML acceleration
  Mukul Lokhande, Akash Sankhe, S. V. Jaya Chand, Santosh Kumar Vishvakarma — MQ — 37 · 0 · 0 — 16 Nov 2025
Enhancing Machine Learning Model Efficiency through Quantization and Bit Depth Optimization: A Performance Analysis on Healthcare Data
  Mitul Goswami, Romit Chatterjee — MQ — 38 · 0 · 0 — 16 Nov 2025
BitSnap: Checkpoint Sparsification and Quantization in LLM Training
  Yanxin Peng, Qingping Li, Baodong Wu, Shigang Li, Guohao Dai, Shengen Yan, Yu Wang — MQ — 157 · 0 · 0 — 15 Nov 2025
Point Cloud Quantization through Multimodal Prompting for 3D Understanding
  Hongxuan Li, Wencheng Zhu, Huiying Xu, Xinzhong Zhu, Pengfei Zhu — MQ, 3DPC — 269 · 0 · 0 — 15 Nov 2025
Cmprsr: Abstractive Token-Level Question-Agnostic Prompt Compressor
  Ivan Zakazov, Alexander Sharipov, Berke Argin, Oussama Gabouj, Kamel Charaf, Alexi Semiz, Lorenzo Drudi, Nicolas Mario Baldwin, Robert West — MQ — 44 · 0 · 0 — 15 Nov 2025
Low-Bit, High-Fidelity: Optimal Transport Quantization for Flow Matching
  Dara Varam, Diaa Addeen Abuhani, Imran Zualkernan, Raghad AlDamani, Lujain Khalil — MQ, OT — 197 · 0 · 0 — 14 Nov 2025
Temporal-adaptive Weight Quantization for Spiking Neural Networks
  Han Zhang, Qingyan Meng, Jiaqi Wang, Baiyu Chen, Zhengyu Ma, Xiaopeng Fan — MQ — 65 · 0 · 0 — 14 Nov 2025