Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces | International Conference on Learning Representations (ICLR), 2025 |
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Towards Precise Scaling Laws for Video Diffusion Transformers | Computer Vision and Pattern Recognition (CVPR), 2024 |
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs | International Conference on Learning Representations (ICLR), 2024 |
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning | Neural Information Processing Systems (NeurIPS), 2024 |
Scaling Laws for Precision | International Conference on Learning Representations (ICLR), 2024 |
How Does Critical Batch Size Scale in Pre-training? | International Conference on Learning Representations (ICLR), 2024 |