All Papers
Title |
|---|
LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions. International Conference on Learning Representations (ICLR), 2024 |
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation. International Conference on Learning Representations (ICLR), 2023 |
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. Neural Information Processing Systems (NeurIPS), 2023 |
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding. Neural Information Processing Systems (NeurIPS), 2023 |
Fast Attention Requires Bounded Entries. Neural Information Processing Systems (NeurIPS), 2023 |