arXiv:2503.08040
Accurate INT8 Training Through Dynamic Block-Level Fallback
11 March 2025
Pengle Zhang
Jia Wei
Jintao Zhang
Jun-Jie Zhu
Jianfei Chen
Tags: MQ
Links: arXiv (abs) · PDF · HTML · HuggingFace · GitHub

Papers citing "Accurate INT8 Training Through Dynamic Block-Level Fallback" (5 papers)

PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
Tianchen Zhao, Ke Hong, Xinhao Yang, Xuefeng Xiao, Huixia Li, ..., Ruiqi Xie, Siqi Chen, Hongyu Zhu, Xicheng Zhang, Yu Wang
Tags: MQ, VGen · 19 Jun 2025

SageAttention2++: A More Efficient Implementation of SageAttention2
Jintao Zhang, Xiaoming Xu, Jia Wei, Haofeng Huang, Pengle Zhang, Chendong Xiang, Jun Zhu, Jianfei Chen
Tags: MQ, VLM · 27 May 2025

Scaling Law for Quantization-Aware Training
Mengzhao Chen, Chaoyi Zhang, Jing Liu, Yutao Zeng, Zeyue Xue, ..., Yunshui Li, Jin Ma, Jie Huang, Xun Zhou, Ping Luo
Tags: MQ · 20 May 2025

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
International Conference on Learning Representations (ICLR), 2024
Jintao Zhang, Jia Wei, Pengle Zhang, Jun-Jie Zhu, Jun Zhu, Jianfei Chen
Tags: VLM, MQ · 03 Oct 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Chengyue Wu, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han
07 May 2024