2405.17849
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
28 May 2024
Yan Chen
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
Papers citing "I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models"
8 / 8 papers shown
IntAttention: A Fully Integer Attention Pipeline for Efficient Edge Inference
Wanli Zhong
Haibo Feng
Zirui Zhou
Hanyang Peng
Shiqi Yu
MQ
369
1
0
26 Nov 2025
IPTQ-ViT: Post-Training Quantization of Non-linear Functions for Integer-only Vision Transformers
Gihwan Kim
Jemin Lee
Hyungshin Kim
MQ
191
0
0
19 Nov 2025
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu
Yan Chen
Qiang Wu
Dawei Yang
MQ
263
0
0
24 Sep 2025
A Survey: Towards Privacy and Security in Mobile Large Language Models
Honghui Xu
Kaiyang Li
Wei Chen
Danyang Zheng
Zhiyuan Li
Zhipeng Cai
PILM
332
0
0
02 Sep 2025
KLLM: Fast LLM Inference with K-Means Quantization
Xueying Wu
Baijun Zhou
Zhihui Gao
Yuzhe Fu
Qilin Zheng
Yintao He
Hai Helen Li
MQ
347
0
0
30 Jul 2025
Latent Video Dataset Distillation
Ning Li
Antai Andy Liu
Jingran Zhang
Justin Cui
DD
VGen
594
2
0
23 Apr 2025
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
International Conference on Learning Representations (ICLR), 2025
Yan Chen
Yuan Cheng
Dawei Yang
Zukang Xu
Zhihang Yuan
Jiangyong Yu
Chen Xu
Zhe Jiang
Sifan Zhou
MQ
350
62
0
23 Jan 2025
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
592
76
0
22 Apr 2024