Scalable Diffusion Models with Transformers

19 December 2022
William S. Peebles
Saining Xie
    GNN

Papers citing "Scalable Diffusion Models with Transformers"

50 / 463 papers shown
Title
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
68
22
0
18 Mar 2025
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation
Baiqin Wang
Xiangyu Zhu
Fan Shen
Hao-Xuan Xu
Zhen Lei
63
0
0
18 Mar 2025
Personalize Anything for Free with Diffusion Transformer
Haoran Feng
Zehuan Huang
Lin Li
Hairong Lv
Lu Sheng
DiffM
87
1
0
16 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Y. Yang
97
1
0
16 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
57
0
0
14 Mar 2025
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent
Kyle Hsu
Justin Johnson
L. Fei-Fei
Jiajun Wu
DiffM
MU
53
3
0
14 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
68
0
0
13 Mar 2025
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo
Zeyu Hu
Na Zhao
De Wen Soh
VGen
94
2
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
69
8
0
13 Mar 2025
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
Zexuan Yan
Yue Ma
Chang Zou
Wenteng Chen
Qifeng Chen
Linfeng Zhang
63
0
0
13 Mar 2025
RFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
45
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
81
8
0
13 Mar 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola
Aaron Gokaslan
Justin T Chiu
Zhihan Yang
Zhixuan Qi
Jiaqi Han
S. Sahoo
Volodymyr Kuleshov
DiffM
72
5
0
12 Mar 2025
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang
Yifei Liu
Yingdong Shi
C. Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
69
0
0
12 Mar 2025
Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation
Feng Zhou
Pu Cao
Yiyang Ma
Lu Yang
Jianqin Yin
DiffM
51
0
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
163
0
0
12 Mar 2025
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
Yucheng Suo
Fan Ma
Kaixin Shen
Linchao Zhu
Yi Yang
VLM
52
0
0
12 Mar 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
X. Li
Jason Kuen
H. Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe-nan Lin
Marios Savvides
62
0
0
11 Mar 2025
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
Yiming Zhong
Qi Jiang
Jingyi Yu
Yuexin Ma
58
2
0
11 Mar 2025
Aligning Text to Image in Diffusion Models is Easier Than You Think
J. Lee
Byunghee Cha
Jeongsol Kim
Jong Chul Ye
52
0
0
11 Mar 2025
Rethinking Diffusion Model in High Dimension
Zhenxin Zheng
Zhenjie Zheng
DiffM
46
0
0
11 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi-Long Han
Yandong Tang
Liangqiong Qu
44
0
0
10 Mar 2025
Effective and Efficient Masked Image Generation Models
Zebin You
Jingyang Ou
Xiaolu Zhang
Jun Hu
Jun Zhou
Chongxuan Li
DiffM
VLM
64
1
0
10 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
135
1
0
10 Mar 2025
Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping
Ning Ding
Jing Han
Yuchuan Tian
Chao Xu
Kai Han
Yehui Tang
MQ
44
0
0
10 Mar 2025
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset
Shuhe Wang
Xiaoya Li
Jiwei Li
G. Wang
Xiaofei Sun
...
Han Qiu
Mo Yu
Shengjie Shen
Tianwei Zhang
Eduard H. Hovy
VLM
63
0
0
10 Mar 2025
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
Xavier Thomas
Deepti Ghadiyaram
DiffM
92
0
0
09 Mar 2025
Infinite Leagues Under the Sea: Photorealistic 3D Underwater Terrain Generation by Latent Fractal Diffusion Models
Tianyi Zhang
Weiming Zhi
Joshua Mangelson
Matthew Johnson-Roberson
45
0
0
09 Mar 2025
Conceptrol: Concept Control of Zero-shot Personalized Image Generation
Qiyuan He
Angela Yao
DiffM
41
0
0
09 Mar 2025
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
Xirui Hu
Jiahao Wang
Hao Chen
Weizhan Zhang
Benqi Wang
Y. Li
Haishun Nan
DiffM
67
0
0
09 Mar 2025
BioMoDiffuse: Physics-Guided Biomechanical Diffusion for Controllable and Authentic Human Motion Synthesis
Zixi Kang
Xinghan Wang
Yadong Mu
VGen
86
0
0
08 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
62
0
0
08 Mar 2025
VACT: A Video Automatic Causal Testing System and a Benchmark
Haotong Yang
Qingyuan Zheng
Yunjian Gao
Yongkun Yang
Yangbo He
Zhouchen Lin
Muhan Zhang
VGen
CML
59
0
0
08 Mar 2025
Generative Trajectory Stitching through Diffusion Composition
Yunhao Luo
Utkarsh Aashu Mishra
Yilun Du
Danfei Xu
135
1
0
07 Mar 2025
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
Hongjie Fang
Chenxi Wang
Yiming Wang
J. Chen
Shangning Xia
...
Xinyu Zhan
Lixin Yang
Weiming Wang
Cewu Lu
Hao-Shu Fang
82
1
0
05 Mar 2025
All-atom Diffusion Transformers: Unified generative modelling of molecules and materials
Chaitanya K. Joshi
Xiang Fu
Yi-Lun Liao
Vahe Gharakhanyan
Benjamin Kurt Miller
Anuroop Sriram
Zachary W. Ulissi
DiffM
53
4
0
05 Mar 2025
ProReflow: Progressive Reflow with Decomposed Velocity
Lei Ke
Haohang Xu
Xuefei Ning
Y. Li
J. Li
Haoling Li
Yuxuan Lin
Dongsheng Jiang
Y. Yang
Linfeng Zhang
DiffM
62
1
0
05 Mar 2025
Multi-agent Auto-Bidding with Latent Graph Diffusion Models
Dom Huh
P. Mohapatra
DiffM
AI4CE
41
0
0
04 Mar 2025
Bayesian Inverse Problems Meet Flow Matching: Efficient and Flexible Inference via Transformers
Daniil Sherki
Ivan V. Oseledets
Ekaterina A. Muravleva
55
0
0
03 Mar 2025
Enhancing Retinal Vessel Segmentation Generalization via Layout-Aware Generative Modelling
Jonathan Fhima
Jan Van Eijgen
Lennert Beeckmans
Thomas Jacobs
Moti Freiman
Luis Filipe Nakayama
Ingeborg Stalmans
Chaim Baskin
Joachim A. Behar
MedIm
69
0
0
03 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Ming-Yu Liu
J. Zhu
Qinsheng Zhang
DiffM
49
0
0
03 Mar 2025
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
Anh Tong
Thanh Nguyen-Tang
Dongeun Lee
Duc Nguyen
Toan M. Tran
David Hall
Cheongwoong Kang
Jaesik Choi
35
0
0
03 Mar 2025
Zero-Shot Head Swapping in Real-World Scenarios
S. Jeong
Taewoong Kang
Hyojin Jang
Jaegul Choo
34
0
0
02 Mar 2025
A Review on Generative AI For Text-To-Image and Image-To-Image Generation and Implications To Scientific Images
Zineb Sordo
Eric Chagnon
Daniela Ushizima
EGVM
MedIm
66
1
0
28 Feb 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
69
0
0
28 Feb 2025
Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer
Yufei Guo
Xiaode Liu
Y. Chen
Weihang Peng
Yuhan Zhang
Zhe Ma
MQ
43
1
0
28 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
92
0
0
27 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
83
6
0
27 Feb 2025
TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis
Menghao Li
Zhenghao Zhang
Junchao Liao
Long Qin
Weizhi Wang
DiffM
VGen
69
0
0
26 Feb 2025
MAD-AD: Masked Diffusion for Unsupervised Brain Anomaly Detection
Farzad Beizaee
Gregory A. Lodygensky
Christian Desrosiers
Jose Dolz
DiffM
MedIm
42
0
0
24 Feb 2025