2112.07628
Cited By
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
14 December 2021
Zhao-quan Song
Licheng Zhang
Ruizhe Zhang
Papers citing
"Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time"
15 / 15 papers shown
Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen
Jiayan Huo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
57
11
0
03 Jan 2025
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
84
18
0
14 Oct 2024
Quantum Speedup for Spectral Approximation of Kronecker Products
Yeqi Gao
Zhao-quan Song
Ruizhe Zhang
13
3
0
10 Feb 2024
Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training
Lianke Qin
Saayan Mitra
Zhao-quan Song
Yuanyuan Yang
Tianyi Zhou
27
0
0
19 Nov 2023
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
Iman Mirzadeh
Keivan Alizadeh-Vahid
Sachin Mehta
C. C. D. Mundo
Oncel Tuzel
Golnoosh Samei
Mohammad Rastegari
Mehrdad Farajtabar
118
58
0
06 Oct 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao-quan Song
Chiwun Yang
28
29
0
23 Aug 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao-quan Song
Yuanyuan Yang
20
9
0
13 Jul 2023
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Zhao-quan Song
Mingquan Ye
Junze Yin
Licheng Zhang
21
7
0
07 Jun 2023
Fast and Efficient Matching Algorithm with Deadline Instances
Zhao-quan Song
Weixin Wang
Chenbo Yin
Junze Yin
8
7
0
15 May 2023
A General Algorithm for Solving Rank-one Matrix Sensing
Lianke Qin
Zhao-quan Song
Ruizhe Zhang
13
15
0
22 Mar 2023
A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee
Zhao-quan Song
Mingquan Ye
Junze Yin
Licheng Zhang
14
10
0
01 Feb 2023
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao-quan Song
Ruizhe Zhang
Danyang Zhuo
69
32
0
25 Nov 2022
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
Zhao-quan Song
Yitan Wang
Zheng Yu
Licheng Zhang
FedML
18
28
0
15 Oct 2022
Fast Graph Neural Tangent Kernel via Kronecker Sketching
Shunhua Jiang
Yunze Man
Zhao-quan Song
Zheng Yu
Danyang Zhuo
16
5
0
04 Dec 2021
Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update
Michal Derezinski
Jonathan Lacotte
Mert Pilanci
Michael W. Mahoney
21
26
0
15 Jul 2021