2402.04497
Cited By
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
7 February 2024
Josh Alman
Zhao-quan Song
Papers citing
"The Fine-Grained Complexity of Gradient Computation for Training Large Language Models"
5 / 5 papers shown
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
82
18
0
14 Oct 2024
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu
Pei-Hsuan Chang
Haozheng Luo
Hong-Yu Chen
Weijian Li
Wei-Po Wang
Han Liu
31
25
0
04 Apr 2024
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu
Jerry Yao-Chieh Hu
Teng-Yun Hsiao
Han Liu
40
28
0
04 Apr 2024
Dynamic Tensor Product Regression
Aravind Reddy
Zhao-quan Song
Licheng Zhang
37
20
0
08 Oct 2022
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
C. Hegde
63
107
0
11 Sep 2022