2403.02352
Cited By
ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys
1 March 2024
Yue Niu
Saurav Prakash
Salman Avestimehr
Papers citing "ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys"
2 / 2 papers shown
Title
3LegRace: Privacy-Preserving DNN Training over TEEs and GPUs
Yue Niu, Ramy E. Ali, Salman Avestimehr
FedML | 44 | 17 | 0 | 04 Oct 2021

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE | 238 | 579 | 0 | 12 Mar 2020