Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2302.13214
Cited By
Fast Attention Requires Bounded Entries
26 February 2023
Josh Alman
Zhao-quan Song
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Fast Attention Requires Bounded Entries"
16 / 16 papers shown
Title
Attention Condensation via Sparsity Induced Regularized Training
Eli Sason
Darya Frolova
Boris Nazarov
Felix Goldberd
75
0
0
03 Mar 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
84
17
0
21 Feb 2025
Low-Rank Thinning
Annabelle Michael Carrell
Albert Gong
Abhishek Shetty
Raaz Dwivedi
Lester W. Mackey
53
0
0
17 Feb 2025
Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen
Jiayan Huo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
48
11
0
03 Jan 2025
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
J. Han
ViT
27
1
0
19 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
75
17
0
14 Oct 2024
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
47
3
0
03 Sep 2024
When big data actually are low-rank, or entrywise approximation of certain function-generated matrices
Stanislav Budzinskiy
49
2
0
03 Jul 2024
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu
Pei-Hsuan Chang
Haozheng Luo
Hong-Yu Chen
Weijian Li
Wei-Po Wang
Han Liu
31
25
0
04 Apr 2024
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu
Jerry Yao-Chieh Hu
Teng-Yun Hsiao
Han Liu
38
28
0
04 Apr 2024
Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training
Lianke Qin
Saayan Mitra
Zhao-quan Song
Yuanyuan Yang
Tianyi Zhou
21
0
0
19 Nov 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao-quan Song
Chiwun Yang
28
29
0
23 Aug 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao-quan Song
Yuanyuan Yang
20
9
0
13 Jul 2023
Differentially Private Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
42
19
0
08 May 2023
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao-quan Song
Ruizhe Zhang
Danyang Zhuo
59
32
0
25 Nov 2022
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
C. Hegde
50
107
0
11 Sep 2022
1