ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.11268
  4. Cited By
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

15 October 2024
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
ArXivPDFHTML

Papers citing "Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent"

6 / 6 papers shown
Title
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
78
17
0
21 Feb 2025
Fast Gradient Computation for RoPE Attention in Almost Linear Time
Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen
Jiayan Huo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
48
11
0
03 Jan 2025
Advancing the Understanding of Fixed Point Iterations in Deep Neural
  Networks: A Detailed Analytical Study
Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
53
3
0
15 Oct 2024
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Yingyu Liang
Jiangxuan Long
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
54
5
0
15 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
75
17
0
14 Oct 2024
Can Looped Transformers Learn to Implement Multi-step Gradient Descent
  for In-context Learning?
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry
Nikunj Saunshi
Sashank J. Reddi
Stefanie Jegelka
Sanjiv Kumar
56
17
0
10 Oct 2024
1