ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2209.06979
  4. Cited By
Efficient Quantized Sparse Matrix Operations on Tensor Cores

Efficient Quantized Sparse Matrix Operations on Tensor Cores

14 September 2022
Shigang Li
Kazuki Osawa
Torsten Hoefler
ArXivPDFHTML

Papers citing "Efficient Quantized Sparse Matrix Operations on Tensor Cores"

9 / 9 papers shown
Title
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
Zihao Ye
Lequn Chen
Ruihang Lai
Wuwei Lin
Yineng Zhang
...
Tianqi Chen
Baris Kasikci
Vinod Grover
Arvind Krishnamurthy
Luis Ceze
40
19
0
02 Jan 2025
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Wenqi Jiang
Marco Zeller
R. Waleffe
Torsten Hoefler
Gustavo Alonso
33
14
0
15 Oct 2023
Chimera: Efficiently Training Large-Scale Neural Networks with
  Bidirectional Pipelines
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
71
94
0
14 Jul 2021
Carbon Emissions and Large Neural Network Training
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
233
626
0
21 Apr 2021
Sparsity in Deep Learning: Pruning and growth for efficient inference
  and training in neural networks
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
128
679
0
31 Jan 2021
I-BERT: Integer-only BERT Quantization
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
80
332
0
05 Jan 2021
Big Bird: Transformers for Longer Sequences
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
246
1,982
0
28 Jul 2020
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using
  Model Parallelism
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019
1