All Papers

Title | Venue, Year |
|---|---|
Blockwise Compression of Transformer-based Models without Retraining | Neural Networks (Neural Netw.), 2023 |
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | International Conference on Machine Learning (ICML), 2023 |
A Comprehensive Review and a Taxonomy of Edge Machine Learning: Requirements, Paradigms, and Techniques | Applied Informatics (AI), 2023 |
Broken Neural Scaling Laws | International Conference on Learning Representations (ICLR), 2022 |
CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models | Neural Information Processing Systems (NeurIPS), 2022 |