Efficient Joint Prediction of Multiple Future Tokens

24 March 2025
Kwangjun Ahn
Alex Lamb
John Langford
Abstract

In this short report, we introduce joint multi-token prediction (JTP), a lightweight modification of standard next-token prediction designed to enrich hidden state representations by jointly predicting multiple future tokens. Unlike previous multi-token prediction approaches, JTP strategically employs teacher forcing of future tokens through a carefully designed representation bottleneck, allowing the model to encode rich predictive information with minimal computational overhead during training. We show that the JTP approach achieves a short-horizon belief state representation, while popular alternatives for multi-token prediction fail to do so. We demonstrate the effectiveness of our method on the synthetic star graph navigation task from Bachmann and Nagarajan [2024], highlighting a significant performance improvement over existing methods. This manuscript presents promising preliminary results intended to stimulate further research.
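
The distinction the abstract draws between joint prediction and other multi-token schemes can be illustrated with a toy calculation. The sketch below is not the authors' architecture or loss; it only compares, on a hypothetical two-token distribution, a model that predicts each future token independently with one that conditions the second token on the true first token (teacher forcing, via the chain rule). The vocabulary, the distribution `joint`, and all function names are assumptions made for illustration.

```python
import math
from itertools import product

# Hypothetical toy setup: the next two tokens are either "a a" or "b b",
# each with probability 0.5, so the two future tokens are perfectly correlated.
vocab = ("a", "b")
joint = {("a", "a"): 0.5, ("b", "b"): 0.5, ("a", "b"): 0.0, ("b", "a"): 0.0}

def marginals(p):
    """Return the marginal distributions of the first and second future token."""
    first = {t: sum(p[(t, u)] for u in vocab) for t in vocab}
    second = {u: sum(p[(t, u)] for t in vocab) for u in vocab}
    return first, second

def independent_model(p):
    """Best model that predicts each future token separately: q(x1, x2) = p(x1) p(x2)."""
    first, second = marginals(p)
    return {(t, u): first[t] * second[u] for t, u in product(vocab, repeat=2)}

def teacher_forced_model(p):
    """Chain-rule model: q(x1, x2) = p(x1) p(x2 | x1), conditioning on the true x1."""
    first, _ = marginals(p)
    q = {}
    for t, u in product(vocab, repeat=2):
        cond = p[(t, u)] / first[t] if first[t] > 0 else 0.0
        q[(t, u)] = first[t] * cond
    return q

def cross_entropy(p, q):
    """Cross-entropy in nats of model q under the true distribution p."""
    return -sum(p[k] * math.log(q[k]) for k in p if p[k] > 0)

ce_independent = cross_entropy(joint, independent_model(joint))
ce_joint = cross_entropy(joint, teacher_forced_model(joint))
print(ce_independent)  # ln 4, about 1.386
print(ce_joint)        # ln 2, about 0.693
```

In this toy case the independent predictor assigns probability 0.25 to every pair, including the impossible ones, while the chain-rule predictor recovers the true joint distribution. How JTP realizes this conditioning through its representation bottleneck is described in the paper itself, not in this sketch.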

@article{ahn2025_2503.21801,
  title={Efficient Joint Prediction of Multiple Future Tokens},
  author={Kwangjun Ahn and Alex Lamb and John Langford},
  journal={arXiv preprint arXiv:2503.21801},
  year={2025}
}