2303.16504
Cited By
An Over-parameterized Exponential Regression
29 March 2023
Yeqi Gao
Sridhar Mahadevan
Zhao-quan Song
Papers citing "An Over-parameterized Exponential Regression"
28 / 28 papers shown
Title
Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Jiangxuan Long
Zhao-quan Song
Chiwun Yang
AI4TS
78
0
0
18 Mar 2025
Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows
Chengyue Gong
Xiaoyu Li
Yingyu Liang
Jiangxuan Long
Zhenmei Shi
Zhao-quan Song
Yu Tian
51
3
0
12 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen
Xuyang Guo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
56
3
0
03 Mar 2025
Theoretical Constraints on the Expressive Power of $\mathsf{RoPE}$-based Tensor Attention Transformers
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Mingda Wan
54
8
0
23 Dec 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
77
18
0
15 Oct 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao-quan Song
30
0
0
09 May 2024
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars
Kaiyue Wen
Yuchen Li
Bing Liu
Andrej Risteski
11
21
0
03 Dec 2023
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
Raghav Addanki
Chenyang Li
Zhao-quan Song
Chiwun Yang
42
3
0
24 Nov 2023
The Expressibility of Polynomial based Attention Scheme
Zhao-quan Song
Guangyi Xu
Junze Yin
27
5
0
30 Oct 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu
Jue Wang
Tri Dao
Tianyi Zhou
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
11
189
0
26 Oct 2023
An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
Zhao-quan Song
Chiwun Yang
19
9
0
17 Oct 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao-quan Song
Weixin Wang
Junze Yin
15
25
0
14 Sep 2023
Solving Attention Kernel Regression Problem via Pre-conditioner
Zhao-quan Song
Junze Yin
Licheng Zhang
28
9
0
28 Aug 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao-quan Song
Chiwun Yang
28
29
0
23 Aug 2023
Zero-th Order Algorithm for Softmax Attention Optimization
Yichuan Deng
Zhihang Li
Sridhar Mahadevan
Zhao-quan Song
27
13
0
17 Jul 2023
Fast Quantum Algorithm for Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
Ruizhe Zhang
LRM
23
19
0
16 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao-quan Song
Yuanyuan Yang
20
9
0
13 Jul 2023
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding
Junda Wu
Tong Yu
Rui Wang
Zhao-quan Song
Ruiyi Zhang
Handong Zhao
Chaochao Lu
Shuai Li
Ricardo Henao
VLM
26
22
0
08 Jun 2023
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models
Ritwik Sinha
Zhao-quan Song
Tianyi Zhou
11
23
0
04 Jun 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao-quan Song
Mingquan Ye
14
4
0
13 May 2023
Differentially Private Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
42
19
0
08 May 2023
An Iterative Algorithm for Rescaled Hyperbolic Functions Regression
Yeqi Gao
Zhao-quan Song
Junze Yin
23
33
0
01 May 2023
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
Shuai Li
Zhao-quan Song
Yu Xia
Tong Yu
Tianyi Zhou
28
36
0
26 Apr 2023
Attention Scheme Inspired Softmax Regression
Yichuan Deng
Zhihang Li
Zhao-quan Song
20
42
0
20 Apr 2023
Solving Tensor Low Cycle Rank Approximation
Yichuan Deng
Yeqi Gao
Zhao-quan Song
24
6
0
13 Apr 2023
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Yuchen Li
Yuan-Fang Li
Andrej Risteski
107
61
0
07 Mar 2023
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao-quan Song
Ruizhe Zhang
Danyang Zhuo
64
32
0
25 Nov 2022
Federated Adversarial Learning: A Framework with Convergence Analysis
Xiaoxiao Li
Zhao-quan Song
Jiaming Yang
FedML
14
19
0
07 Aug 2022