ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.08081
  4. Cited By
Mechanics of Next Token Prediction with Self-Attention

Mechanics of Next Token Prediction with Self-Attention

12 March 2024
Yingcong Li
Yixiao Huang
M. E. Ildiz
A. S. Rawat
Samet Oymak
ArXivPDFHTML

Papers citing "Mechanics of Next Token Prediction with Self-Attention"

7 / 7 papers shown
Title
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
46
0
0
02 May 2025
Reasoning Bias of Next Token Prediction Training
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
80
1
0
21 Feb 2025
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu
Ruida Zhou
Cong Shen
Jing Yang
23
0
0
17 Oct 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
24
13
0
08 Feb 2024
Provably learning a multi-head attention layer
Provably learning a multi-head attention layer
Sitan Chen
Yuanzhi Li
MLT
26
14
0
06 Feb 2024
Towards Understanding the Word Sensitivity of Attention Layers: A Study
  via Random Features
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari
Marco Mondelli
26
3
0
05 Feb 2024
How Do Transformers Learn Topic Structure: Towards a Mechanistic
  Understanding
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Yuchen Li
Yuan-Fang Li
Andrej Risteski
107
61
0
07 Mar 2023
1