arXiv:2306.16601
An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs
28 June 2023
Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yunqian Luo, Hanwen Chang, Qun Gao, Zi. Wang, Guy Boudoukh, Moshe Wasserblat
Papers citing "An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs"
1 / 1 papers shown
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He
15 Feb 2024