ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Model Quantization (MQ)

Model quantization reduces the size and computational cost of machine learning models by representing weights and activations at lower numerical precision (e.g., 8-bit integers instead of 32-bit floats). This makes it particularly useful for deploying models on resource-constrained devices such as mobile phones and embedded systems.
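The idea above can be sketched in a few lines of NumPy. This is an illustrative per-tensor symmetric int8 scheme — a minimal sketch, not the method of any particular paper listed below:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map float weights to
    integers in [-127, 127] using a single scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32; the round-to-nearest
# reconstruction error is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Real deployments refine this in many ways — per-channel scales, asymmetric zero-points, mixed precision, quantization-aware training — which is exactly the design space the papers below explore.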


All papers (50 of 2,897 shown)
DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
Youneng Bao, Yulong Cheng, Yiping Liu, Yichen Yang, Peng Qin, Mu Li, Yongsheng Liang · MQ · 11 Nov 2025

Quantizing Whisper-small: How design choices affect ASR performance
Arthur Söhler, Julian Irigoyen, Andreas Søeborg Kirkedal · MQ · 11 Nov 2025

Extreme Model Compression with Structured Sparsity at Low Precision
Dan Liu, Nikita Dvornik, Xue Liu · MQ · 11 Nov 2025

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
Yash Mittal, Dmitry Ignatov, Radu Timofte · MQ, AI4CE · 10 Nov 2025

Learning Quantized Continuous Controllers for Integer Hardware
Fabian Kresse, Christoph H. Lampert · MQ · 10 Nov 2025

MI-to-Mid Distilled Compression (M2M-DC): An Hybrid-Information-Guided-Block Pruning with Progressive Inner Slicing Approach to Model Compression
Lionel Levine, Sajjad Ghiasvand, Haniyeh Ehsani Oskouie, Majid Sarrafzadeh · MQ · 10 Nov 2025
Precision-Scalable Microscaling Datapaths with Optimized Reduction Tree for Efficient NPU Integration
Stef Cuyckens, Xiaoling Yi, Robin Geens, Joren Dumoulin, Martin Wiesner, Chao Fang, Marian Verhelst · MQ · 09 Nov 2025

You Had One Job: Per-Task Quantization Using LLMs' Hidden Representations
Amit LeVi, Raz Lapid, Rom Himelstein, Yaniv Nemcovsky, Ravid Shwartz Ziv, Avi Mendelson · MQ · 09 Nov 2025

Training-Free Adaptive Quantization for Variable Rate Image Coding for Machines
Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe · MQ · 08 Nov 2025

GABFusion: Rethinking Feature Fusion for Low-Bit Quantization of Multi-Task Networks
Zhaoyang Wang, Dong Wang · MQ · 08 Nov 2025

HarmoQ: Harmonized Post-Training Quantization for High-Fidelity Image
Hongjun Wang, Jiyuan Chen, Xuan Song, Yinqiang Zheng · MQ · 08 Nov 2025

MOSS: Efficient and Accurate FP8 LLM Training with Microscaling and Automatic Scaling
Yu Zhang, Hui-Ling Zhen, Mingxuan Yuan, Bei Yu · MQ · 08 Nov 2025

Attention and Compression is all you need for Controllably Efficient Language Models
Jatin Prakash, Aahlad Puli, Rajesh Ranganath · MQ, VLM · 07 Nov 2025
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
Yuantian Shao, Yuanteng Chen, Peisong Wang, Jianlin Yu, Jing Lin, Yiwu Yao, Zhihui Wei, Jian Cheng · MQ · 06 Nov 2025

Memory- and Latency-Constrained Inference of Large Language Models via Adaptive Split Computing
Mingyu Sung, Vikas Palakonda, Suhwan Im, Sunghwan Moon, Il-Min Kim, Sangseok Yun, Jae-Mo Kang · MQ · 06 Nov 2025

Comparative Study of CNN Architectures for Binary Classification of Horses and Motorcycles in the VOC 2008 Dataset
Muhammad Annas Shaikh, Hamza Zaman, Arbaz Asif · MQ · 06 Nov 2025

FedSparQ: Adaptive Sparse Quantization with Error Feedback for Robust & Efficient Federated Learning
Chaimaa Medjadji, Sadi Alawadi, Feras M. Awaysheh, Guilain Leduc, Sylvain Kubler, Yves Le Traon · FedML, MQ · 05 Nov 2025

A Quantized VAE-MLP Botnet Detection Model: A Systematic Evaluation of Quantization-Aware Training and Post-Training Quantization Strategies
Hassan Wasswa, Hussein Abbass, Timothy Lynar · MQ · 05 Nov 2025

FP8-Flow-MoE: A Casting-Free FP8 Recipe without Double Quantization Error
Fengjuan Wang, Zhiyi Su, Xingzhu Hu, Cheng Wang, Mou Sun · MQ · 04 Nov 2025

Opto-Electronic Convolutional Neural Network Design Via Direct Kernel Optimization
Ali Almuallem, Harshana Weligampola, Abhiram Gnanasambandam, Wei Xu, Dilshan Godaliyadda, Hamid R. Sheikh, Stanley H. Chan, Qi Guo · MQ · 03 Nov 2025
Efficiently Training A Flat Neural Network Before It has been Quantizated
Peng Xia, Junbiao Pang, Tianyang Cai · MQ · 03 Nov 2025

Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding
Taowen Liu, Marta Andronic, Deniz Gündüz, George A. Constantinides · MQ · 02 Nov 2025

Fibbinary-Based Compression and Quantization for Efficient Neural Radio Receivers
Roberta Fiandaca, Manil Dev Gomony · MQ · 01 Nov 2025

Outlier-Aware Post-Training Quantization for Image Super-Resolution
Hailing Wang, Jianglin Lu, Yitian Zhang, Y. Fu · MQ · 01 Nov 2025

TetraJet-v2: Accurate NVFP4 Training for Large Language Models with Oscillation Suppression and Outlier Control
Yuxiang Chen, Xiaoming Xu, Pengle Zhang, Michael Beyer, Martin Rapp, Jun Zhu, Jianfei Chen · MQ · 31 Oct 2025

LoRAQuant: Mixed-Precision Quantization of LoRA to Ultra-Low Bits
Amir Reza Mirzaei, Yuqiao Wen, Yanshuai Cao, Lili Mou · MQ · 30 Oct 2025

STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization
Marco Federici, Riccardo Del Chiaro, Boris van Breugel, Paul N. Whatmough, Markus Nagel · MQ · 30 Oct 2025
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
Mengzhao Chen, Meng Wu, Hui Jin, Zhihang Yuan, Jing Liu, ..., Jin Ma, Zeyue Xue, Zhiheng Liu, Xingyan Bin, Ping Luo · MQ · 29 Oct 2025

FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic
Kanghyun Choi, Hyeyoon Lee, S. Park, Dain Kwon, Jinho Lee · MQ · 28 Oct 2025

BitSkip: An Empirical Analysis of Quantization and Early Exit Composition
Ramshankar Bhuvaneswaran, Handan Liu · MQ · 27 Oct 2025

Efficient Low Rank Attention for Long-Context Inference in Large Language Models
Tenghui Li, Guoxu Zhou, Xuyang Zhao, Y. Qiu, Qibin Zhao · MQ, RALM · 25 Oct 2025

Beyond Isotonization: Scalable Non-Crossing Quantile Estimation via Neural Networks for Student Growth Percentiles
Kaihua Chang · MQ · 25 Oct 2025

Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Josip Tomo Licardo, Nikola Tankovic · MQ · 24 Oct 2025

A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization
Xuan Tang, Jichu Li, Difan Zou · MQ · 24 Oct 2025

Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
Xi Zhang, Xiaolin Wu, Jiamang Wang, W. Lin · MQ · 23 Oct 2025
TernaryCLIP: Efficiently Compressing Vision-Language Models with Ternary Weights and Distilled Knowledge
Shu-Hao Zhang, Wei Tang, Chen Wu, Peng Hu, Nan Li, L. Zhang, Qi Zhang, Shao-Qun Zhang · MQ, VLM · 23 Oct 2025

AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
Seunghoon Lee, Jeongwoo Choi, Byunggwan Son, Jaehyeon Moon, Jeimin Jeon, Bumsub Ham · DiffM, MQ · 23 Oct 2025

Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
Jinhee Kim, Jae Jun An, Kang Eun Jeon, Jong Hwan Ko · MQ · 23 Oct 2025

Adaptive Distribution-aware Quantization for Mixed-Precision Neural Networks
Shaohang Jia, Zhiyong Huang, Zhi Yu, Mingyang Hou, Shuai Miao, Han Yang · MQ · 22 Oct 2025

Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
Chenyu Wang, Zhanglu Yan, Zhi Zhou, Xu Chen, Weng-Fai Wong · MQ · 22 Oct 2025

ELUTQ: Efficient LUT-Aware Quantization for Deploying Large Language Models on Edge Devices
Xin Nie, Liang Dong, H. Zhang, JiaWang Xiao, G. Sun · MQ · 22 Oct 2025

CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training
Soroush Tabesh, M. Safaryan, Alexandra Volkova, Dan Alistarh · MQ · 21 Oct 2025
Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
Kyo Kuroki, Yasuyuki Okoshi, Thiem Van Chu, Kazushi Kawamura, Masato Motomura · MQ · 21 Oct 2025

Learning under Quantization for High-Dimensional Linear Regression
Dechen Zhang, Junwei Su, Difan Zou · MQ · 21 Oct 2025

Bitwidth-Specific Logarithmic Arithmetic for Future Hardware-Accelerated Training
Hassan Hamad, Yuou Qiu, Peter A. Beerel, K. Chugg · MQ · 20 Oct 2025

QueST: Incentivizing LLMs to Generate Difficult Problems
Hanxu Hu, Xingxing Zhang, Jannis Vamvas, Rico Sennrich, Furu Wei · AIMat, SyDa, MQ, LRM · 20 Oct 2025

Mixed-Precision Quantization for Language Models: Techniques and Prospects
M. Rakka, Marios Fournarakis, Olga Krestinskaya, Jinane Bazzi, K. Salama, Fadi J. Kurdahi, A. Eltawil, M. Fouda · MQ · 19 Oct 2025

VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs
Jiaying Zhu, Yurui Zhu, Xin Lu, Wenrui Yan, Dong Li, Kunlin Liu, Xueyang Fu, Zheng-Jun Zha · MQ, VLM · 18 Oct 2025

Differentiable, Bit-shifting, and Scalable Quantization without training neural network from scratch
Zia Badar · MQ · 18 Oct 2025

Optimization of the quantization of dense neural networks from an exact QUBO formulation
Sergio Muñiz Subiñas, Manuel L. González, Jorge Ruiz Gómez, Alejandro Mata Ali, Jorge Martínez Martín, Miguel Franco Hernando, Ángel Miguel García-Vico · MQ · 17 Oct 2025