CWM: An Open-Weights LLM for Research on Code Generation with World Models

30 September 2025 · arXiv:2510.02387
FAIR CodeGen Team: Jade Copet, Quentin Carbonneaux, Gal Cohen, Jonas Gehring, Jacob Kahn, Jannik Kossen, Felix Kreuk, Emily McMilin, Michel Meyer, Yuxiang Wei, David W. Zhang, Kunhao Zheng, Jordi Armengol-Estapé, Pedram Bashiri, Maximilian Beck, Pierre Chambon, Abhishek Charnalia, Chris Cummins, Juliette Decugis, Zacharias V. Fisches, François Fleuret, Fabian Gloeckle, A. Gu, Michael Hassid, Daniel Haziza, Badr Youbi Idrissi, Christian Keller, Rahul Kindi, Hugh Leather, Gallil Maimon, Aram H. Markosyan, Francisco Massa, Pierre-Emmanuel Mazaré, Vegard Mella, Naila Murray, Keyur Muzumdar, Peter O'Hearn, Matteo Pagliardini, Dmitrii Pedchenko, Tal Remez, Volker Seeker, Marco Selvi, Oren Sultan, Sida I. Wang, Luca Wehrstedt, Ori Yoran, Lingming Zhang, Taco Cohen, Yossi Adi, Gabriel Synnaeve
Main: 34 pages · Bibliography: 7 pages · Appendix: 17 pages · 36 figures · 9 tables
Abstract

We release Code World Model (CWM), a 32-billion-parameter open-weights LLM, to advance research on code generation with world models. To improve code understanding beyond what can be learned from training on static code alone, we mid-train CWM on a large number of observation-action trajectories from Python interpreter and agentic Docker environments, and perform extensive multi-task reasoning RL in verifiable coding, math, and multi-turn software engineering environments. With CWM, we provide a strong testbed for researchers to explore the opportunities world modeling affords for improving code generation with reasoning and planning in computational environments. We present first steps toward how world models can benefit agentic coding, enable step-by-step simulation of Python code execution, and show early results of how reasoning can benefit from the latter. CWM is a dense, decoder-only LLM trained with a context size of up to 131k tokens. Independent of its world modeling capabilities, CWM offers strong performance on general coding and math tasks: it reaches pass@1 scores of 65.8% on SWE-bench Verified (with test-time scaling), 68.6% on LiveCodeBench, 96.6% on Math-500, and 76.0% on AIME 2024. To support further research on code world modeling, we release model checkpoints after mid-training, SFT, and RL.

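The mid-training data described above pairs interpreter actions with the observations they produce. As a rough illustration only (not the authors' data pipeline), the sketch below uses Python's `sys.settrace` to record a per-line trace of a small function, where each step stores the line about to execute (the "action") and the local variable state at that point (the "observation"). The helper name `trace_execution` and the step format are hypothetical.

```python
# Minimal sketch of collecting an observation-action execution trace
# for a single Python function. Illustrative only; names are hypothetical.
import sys
from typing import Any, Callable


def trace_execution(func: Callable, *args, **kwargs) -> list[dict[str, Any]]:
    """Run `func` and return one step per executed source line."""
    steps: list[dict[str, Any]] = []
    target_code = func.__code__

    def tracer(frame, event, arg):
        # Record only 'line' events raised inside the target function.
        if event == "line" and frame.f_code is target_code:
            steps.append({
                "action": f"line {frame.f_lineno}",      # line about to execute
                "observation": dict(frame.f_locals),     # local state at that point
            })
        return tracer  # keep tracing this frame

    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return steps


def example(n: int) -> int:
    total = 0
    for i in range(n):
        total += i
    return total


if __name__ == "__main__":
    for step in trace_execution(example, 3):
        print(step)
```

Serializing such steps as text is one plausible way to turn interpreter behavior into training sequences; the paper itself should be consulted for the actual trajectory format used.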