Transformers can optimally learn regression mixture models
Reese Pathak, Rajat Sen, Weihao Kong, Abhimanyu Das
International Conference on Learning Representations (ICLR), 2023
arXiv:2311.08362, 14 November 2023 (abs | PDF | HTML)

Papers citing "Transformers can optimally learn regression mixture models" (7 papers shown)

Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
Blake Bordelon, Mary I. Letey, Cengiz Pehlevan
01 Oct 2025

Limitations of refinement methods for weak to strong generalization
Seamus Somerstep, Ya'acov Ritov, Mikhail Yurochkin, Subha Maity, Yuekai Sun
23 Aug 2025

On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li, Chenyang Zhang, Xingwu Chen, Yuan Cao, Difan Zou
24 Feb 2025

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures
Wei Shen, Ruida Zhou, Jing Yang, Cong Shen
15 Oct 2024

A Theoretical Understanding of Self-Correction through In-context Alignment
Yifei Wang, Yuyang Wu, Zeming Wei, Stefanie Jegelka, Yisen Wang
28 May 2024

In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
Ruiqi Zhang, Jingfeng Wu, Peter L. Bartlett
22 Feb 2024

Linear Transformers are Versatile In-Context Learners
Max Vladymyrov, J. Oswald, Mark Sandler, Rong Ge
21 Feb 2024