2407.12077
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression
16 July 2024
Daniel Goldstein
Fares Obeid
Eric Alcaide
Guangyu Song
Eugene Cheah
VLM
AI4TS
Papers citing "GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression"
1 / 1 papers shown
Title
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
You Wu
Haoyi Wu
Kewei Tu
24
3
0
18 Oct 2024