Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference

Neural Information Processing Systems (NeurIPS), 2024
14 August 2024
R. Prabhakar
Hengrui Zhang
D. Wentzlaff

Papers citing "Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference"

1 / 1 papers shown
Title
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
06 Nov 2019