SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

16 January 2024
Nanye Ma
Mark Goldstein
M. S. Albergo
Nicholas M. Boffi
Eric Vanden-Eijnden
Saining Xie
    DiffM

Papers citing "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"

50 / 139 papers shown
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
58
0
0
21 Jan 2025
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian
Jing Han
Chengcheng Wang
Yuchen Liang
Chao Xu
Hanting Chen
DiffM
21
1
0
03 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
119
2
0
03 Jan 2025
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
H. Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
94
1
0
18 Dec 2024
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha
Yapeng Tian
DiffM
VGen
71
2
0
14 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
103
6
0
14 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
76
1
0
11 Dec 2024
StyleMaster: Stylize Your Video with Artistic Generation and Translation
Zixuan Ye
Huijuan Huang
Xintao Wang
Pengfei Wan
Di Zhang
Wenhan Luo
DiffM
VGen
92
4
0
10 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
135
2
0
09 Dec 2024
Coordinate In and Value Out: Training Flow Transformers in Ambient Space
Yuyang Wang
Anurag Ranjan
J. Susskind
Miguel Angel Bautista
3DPC
63
0
0
05 Dec 2024
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang
Tianyuan Zhang
Fujun Luan
Yunze Man
Hao Tan
Kai Zhang
William T. Freeman
Yu-Xiong Wang
VGen
64
12
0
02 Dec 2024
HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving
Z. Wu
Jingcheng Ni
Xiaodong Wang
Yuxin Guo
Rui Chen
Lewei Lu
Jifeng Dai
Yuwen Xiong
72
6
0
02 Dec 2024
Pretrained Reversible Generation as Unsupervised Visual Representation Learning
Rongkun Xue
Jinouwen Zhang
Yazhe Niu
Dazhong Shen
Bingqi Ma
Yu Liu
Jing Yang
65
0
0
29 Nov 2024
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling
J. Hyung
Kinam Kim
Susung Hong
M. Kim
Jaegul Choo
VGen
67
3
0
27 Nov 2024
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
Jinqi Xiao
S. Sang
Tiancheng Zhi
Jing Liu
Qing Yan
Linjie Luo
Bo Yuan
VLM
81
1
0
26 Nov 2024
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation
Bastian Wittmann
Yannick Wattenberg
Tamaz Amiranashvili
Suprosanna Shit
Bjoern H. Menze
74
3
0
26 Nov 2024
Exploring Discrete Flow Matching for 3D De Novo Molecule Generation
Ian Dunn
D. Koes
90
4
0
25 Nov 2024
Rethinking Diffusion for Text-Driven Human Motion Generation
Zichong Meng
Yiming Xie
Xiaogang Peng
Zeyu Han
Huaizu Jiang
VGen
75
0
0
25 Nov 2024
Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
P. Xu
Boyuan Jiang
Xiaobin Hu
Donghao Luo
Q. He
J. Zhang
Chengjie Wang
Yunsheng Wu
Charles X. Ling
Boyu Wang
87
2
0
24 Nov 2024
Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules
Binxu Wang
Jiaqi Shang
Haim Sompolinsky
DiffM
28
1
0
12 Nov 2024
GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation
Yushi Lan
Shangchen Zhou
Zhaoyang Lyu
Fangzhou Hong
Shuai Yang
Bo Dai
Xingang Pan
Chen Change Loy
3DGS
53
3
0
12 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
57
3
0
06 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
50
28
1
01 Nov 2024
FlowDCN: Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution
Shuai Wang
Zexian Li
Tianhui Song
Xubin Li
Tiezheng Ge
Bo Zheng
L. Wang
22
1
0
30 Oct 2024
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
Mingyuan Zhou
Huangjie Zheng
Yi Gu
Zhendong Wang
Hai Huang
DiffM
39
4
0
19 Oct 2024
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
Shaozhe Hao
Xuantong Liu
Xianbiao Qi
Shihao Zhao
Bojia Zi
Rong Xiao
Kai Han
Kwan-Yee K. Wong
41
3
0
18 Oct 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
ZiDong Wang
Zeyu Lu
Di Huang
Cai Zhou
Wanli Ouyang
Lei Bai
69
3
0
17 Oct 2024
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
Saksham Singh Kushwaha
Jianbo Ma
Mark R. P. Thomas
Yapeng Tian
Avery Bruni
27
1
0
15 Oct 2024
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification
J. Yao
Wang Cheng
Wenyu Liu
Xinggang Wang
38
8
0
14 Oct 2024
EquiJump: Protein Dynamics Simulation via SO(3)-Equivariant Stochastic Interpolants
Allan dos Santos Costa
Ilan Mitnikov
Franco Pellegrini
Ameya Daigavane
Mario Geiger
Zhonglin Cao
Karsten Kreis
Tess E. Smidt
E. Küçükbenli
J. Jacobson
27
0
0
12 Oct 2024
Diffusion Models Need Visual Priors for Image Generation
Xiaoyu Yue
Zidong Wang
Zeyu Lu
S. Sun
Meng Wei
Wanli Ouyang
Lei Bai
Luping Zhou
VLM
40
1
0
11 Oct 2024
Scaling Laws For Diffusion Transformers
Zhengyang Liang
Hao He
Ceyuan Yang
Bo Dai
27
8
0
10 Oct 2024
I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow
Ruoyi Du
Dongyang Liu
Le Zhuo
Qin Qi
Hongsheng Li
Zhanyu Ma
Peng Gao
24
1
0
10 Oct 2024
EventFlow: Forecasting Continuous-Time Event Data with Flow Matching
Gavin Kerrigan
Kai Nelson
Padhraic Smyth
AI4TS
22
1
0
09 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
50
4
0
09 Oct 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
65
59
0
09 Oct 2024
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
Han Lin
Tushar Nagarajan
Nicolas Ballas
Mido Assran
Mojtaba Komeili
Mohit Bansal
Koustuv Sinha
AI4TS
49
3
0
04 Oct 2024
Dynamic Diffusion Transformer
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Yibing Song
Gao Huang
Fan Wang
Yang You
71
11
0
04 Oct 2024
Stochastic Sampling from Deterministic Flow Models
Saurabh Singh
Ian S. Fischer
26
2
0
03 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
47
2
0
02 Oct 2024
Effective Diffusion Transformer Architecture for Image Super-Resolution
Kun Cheng
Lei Yu
Zhijun Tu
Xiao He
Liyu Chen
Yong Guo
Mingrui Zhu
Nannan Wang
Xinbo Gao
Jie Hu
29
0
0
29 Sep 2024
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
Wenliang Zhao
Minglei Shi
Xumin Yu
Jie Zhou
Jiwen Lu
27
0
0
26 Sep 2024
Generative Modeling of Molecular Dynamics Trajectories
Bowen Jing
Hannes Stärk
Tommi Jaakkola
Bonnie Berger
AI4CE
27
14
0
26 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
50
10
0
23 Sep 2024
Differentially Private Kernel Density Estimation
Erzhi Liu
Jerry Yao-Chieh Hu
Alex Reneau
Zhao Song
Han Liu
56
3
0
03 Sep 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu
Zhuofan Xia
Jiayi Guo
Dongchen Han
Qixiu Li
...
Ji Li
Yizeng Han
Shiji Song
Gao Huang
Xiu Li
53
11
0
11 Aug 2024
Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
DiffM
VGen
36
9
0
08 Aug 2024
Scaling Diffusion Transformers to 16 Billion Parameters
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Debang Li
Junshi Huang
DiffM
MoE
54
15
0
16 Jul 2024
MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation
Zihao Wang
Haoxuan Liu
Jiaxing Yu
Tao Zhang
Yan Liu
K. Zhang
52
1
0
03 Jul 2024
Neural Residual Diffusion Models for Deep Scalable Vision Generation
Zhiyuan Ma
Liangliang Zhao
Biqing Qi
Bowen Zhou
DiffM
53
2
0
19 Jun 2024