2406.07553
Cited By
Inference Acceleration for Large Language Models on CPUs
4 March 2024
PS Ditto
VG Jithin
MS Adarsh
Papers citing "Inference Acceleration for Large Language Models on CPUs"
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
06 Nov 2019