2508.15884
Cited By
v1
v2
v3 (latest)
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
21 August 2025
Yuxian Gu
Qinghao Hu
Shang Yang
Haocheng Xi
Junyu Chen
Song Han
Han Cai
ArXiv (abs)
PDF
HTML
HuggingFace (2 upvotes)
Github (663★)
Papers citing "Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search"
9 / 9 papers shown
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Y. Fu
Xin Dong
Shizhe Diao
Matthijs Van Keirsbilck
Hanrong Ye
...
Maksim Khadkevich
A. Keller
Jan Kautz
Y. Lin
Pavlo Molchanov
158
0
0
24 Nov 2025
E³-Pruner: Towards Efficient, Economical, and Effective Layer Pruning for Large Language Models
Tao Yuan
Haoli Bai
Yinfei Pan
Xuyang Cao
Tianyu Zhang
Lu Hou
Ting Hu
Xianzhi Yu
VLM
211
0
0
21 Nov 2025
Training Foundation Models on a Full-Stack AMD Platform: Compute, Networking, and System Design
Quentin G. Anthony
Yury Tokpanov
Skyler Szot
Srivatsan Rajagopal
Praneeth Medepalli
...
Emad Barsoum
Zhenyu Gu
Yao Fu
Beren Millidge
MoE
VLM
LRM
257
0
0
21 Nov 2025
Apriel-H1: Towards Efficient Enterprise Reasoning Models
Oleksiy Ostapenko
Luke Kumar
Raymond Li
Denis Kocetkov
J. Lamy-Poirier
...
Sébastien Paquet
Srinivas Sunkara
Valérie Bécaert
Sathwik Tejaswi Madhusudhan
Torsten Scholak
LRM
128
2
0
04 Nov 2025
Kimi Linear: An Expressive, Efficient Attention Architecture
Kimi Team
Yu Zhang
Zongyu Lin
Xingcheng Yao
J. Hu
...
Guokun Lai
Yuxin Wu
Xinyu Zhou
Zhilin Yang
Yulun Du
132
8
0
30 Oct 2025
Hybrid Architectures for Language Models: Systematic Analysis and Design Insights
Sangmin Bae
Bilge Acun
Haroun Habeeb
S. Kim
Chien-Yu Lin
Liang Luo
Junjie Wang
Carole-Jean Wu
148
4
0
06 Oct 2025
Composer: A Search Framework for Hybrid Neural Architecture Design
Bilge Acun
Prasoon Sinha
Newsha Ardalani
Sangmin Bae
Alicia Golden
Chien-Yu Lin
Meghana Madhyastha
Fei Sun
N. Yadwadkar
Carole-Jean Wu
216
1
0
01 Oct 2025
DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space
Wenkun He
Yuchao Gu
Junyu Chen
Dongyun Zou
Yujun Lin
...
Jincheng Yu
Junsong Chen
Enze Xie
Song Han
Han Cai
209
2
0
29 Sep 2025
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder
Junyu Chen
Wenkun He
Yuchao Gu
Yuyang Zhao
Jincheng Yu
...
Haocheng Xi
Ligeng Zhu
Enze Xie
Song Han
Han Cai
VGen
174
2
0
29 Sep 2025