ResearchTrend.AI
ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys


1 March 2024
Yue Niu, Saurav Prakash, Salman Avestimehr

Papers citing "ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys"

2 of 2 papers shown
1. 3LegRace: Privacy-Preserving DNN Training over TEEs and GPUs
   Yue Niu, Ramy E. Ali, Salman Avestimehr (FedML), 04 Oct 2021
2. Efficient Content-Based Sparse Attention with Routing Transformers
   Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier (MoE), 12 Mar 2020