2408.07802
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
Neural Information Processing Systems (NeurIPS), 2024
14 August 2024
R. Prabhakar
Hengrui Zhang
D. Wentzlaff
Papers citing
"Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference"
1 / 1 papers shown
Title
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
476
616
0
06 Nov 2019