2402.17985
Cited By
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
28 February 2024
Yi Zhang
Fei Yang
Shuang Peng
Fangyu Wang
Aimin Pan
MQ
Papers citing "FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization"
Title
No papers