2112.10752
High-Resolution Image Synthesis with Latent Diffusion Models
20 December 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
Papers citing
"High-Resolution Image Synthesis with Latent Diffusion Models"
50 / 7,988 papers shown
Title
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
68
2
0
05 Apr 2025
Deconstructing Bias: A Multifaceted Framework for Diagnosing Cultural and Compositional Inequities in Text-to-Image Generative Models
Muna Numan Said
Aarib Zaidi
Rabia Usman
Sonia Okon
Praneeth Medepalli
Kevin Zhu
Vasu Sharma
Sean O'Brien
22
0
0
05 Apr 2025
Could AI Trace and Explain the Origins of AI-Generated Images and Text?
Hongchao Fang
Yixin Liu
R. Xu
Can Qin
Y. Liu
Feng Liu
Lichao Sun
Dongwon Lee
Lifu Huang
Wenpeng Yin
DeLMO
62
0
0
05 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
23
0
0
05 Apr 2025
Embedding Hidden Adversarial Capabilities in Pre-Trained Diffusion Models
Lucas Beerens
D. Higham
DiffM
WIGM
55
0
0
05 Apr 2025
Loss Functions in Deep Learning: A Comprehensive Review
Omar Elharrouss
Yasir Mahmood
Yassine Bechqito
Mohamed Adel Serhani
E. Badidi
Jamal Riffi
Hamid Tairi
33
0
0
05 Apr 2025
Conditioning Diffusions Using Malliavin Calculus
Jakiw Pidstrigach
Elizabeth Baker
Carles Domingo-Enrich
George Deligiannidis
Nikolas Nüsken
DiffM
33
0
0
04 Apr 2025
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models
Ved Umrajkar
Aakash Kumar Singh
23
0
0
04 Apr 2025
Simultaneous Learning of Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model
Kotaro Ikeda
Masanori Koyama
Jinzhe Zhang
Kohei Hayashi
Kenji Fukumizu
OT
92
0
0
04 Apr 2025
Generating ensembles of spatially-coherent in-situ forecasts using flow matching
David Landry
C. Monteleoni
A. Charantonis
60
0
0
04 Apr 2025
Physics-informed 4D X-ray image reconstruction from ultra-sparse spatiotemporal data
Zisheng Yao
Yuhe Zhang
Zhe Hu
Robert Klöfkorn
Tobias Ritschel
Pablo Villanueva-Perez
AI4CE
64
1
0
04 Apr 2025
Learning Natural Language Constraints for Safe Reinforcement Learning of Language Agents
Jaymari Chua
Chen Wang
Lina Yao
ALM
45
1
0
04 Apr 2025
Dynamic Objective MPC for Motion Planning of Seamless Docking Maneuvers
Oliver Schumann
Michael Buchholz
Klaus C. J. Dietmayer
38
0
0
04 Apr 2025
A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models
Andrew Kiruluta
Andreas Lemos
DiffM
28
3
0
04 Apr 2025
3D Scene Understanding Through Local Random Access Sequence Modeling
Wanhee Lee
Klemen Kotar
R. Venkatesh
Jared Watrous
Honglin Chen
Khai Loong Aw
Daniel L. K. Yamins
3DV
34
0
0
04 Apr 2025
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders
Ilan Naiman
Emanuel Ben-Baruch
Oron Anschel
Alon Shoshan
Igor Kviatkovsky
Manoj Aggarwal
Gérard Medioni
34
0
0
04 Apr 2025
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
Boyuan Wang
Runqi Ouyang
Xiaofeng Wang
Zheng Zhu
Guosheng Zhao
Chaojun Ni
Guan Huang
Lihong Liu
Xingang Wang
3DGS
66
0
0
04 Apr 2025
MAD: Makeup All-in-One with Cross-Domain Diffusion Model
Bo-Kai Ruan
Hong-Han Shuai
DiffM
34
0
0
03 Apr 2025
Autonomous Human-Robot Interaction via Operator Imitation
Sammy Christen
David Müller
Agon Serifi
Ruben Grandia
Georg Wiedebach
Michael A. Hopkins
Espen Knoop
Moritz Bächer
LM&Ro
49
0
0
03 Apr 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong
Zunnan Xu
Zixiang Zhou
Jun Zhou
Xiu Li
Qin Lin
Qinglin Lu
D. Xu
DiffM
VGen
51
2
0
03 Apr 2025
RBT4DNN: Requirements-based Testing of Neural Networks
Nusrat Jahan Mozumder
Felipe Toledo
Swaroopa Dola
Matthew B. Dwyer
AAML
46
1
0
03 Apr 2025
Exploration-Driven Generative Interactive Environments
N. Savov
Naser Kazemi
Mohammad Mahdi
Danda Pani Paudel
Xi Wang
Luc Van Gool
VGen
3DV
38
0
0
03 Apr 2025
All-day Depth Completion via Thermal-LiDAR Fusion
Janghyun Kim
Minseong Kweon
Jinsun Park
Ukcheol Shin
VLM
37
0
0
03 Apr 2025
Towards Assessing Deep Learning Test Input Generators
Seif Mzoughi
Ahmed Hajyahmed
Mohamed Elshafei
Foutse Khomh
Diego Elias Costa
AAML
35
0
0
03 Apr 2025
WonderTurbo: Generating Interactive 3D World in 0.72 Seconds
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
W. Wang
Haoyun Li
Guosheng Zhao
Jie Li
Wenkang Qin
Guan Huang
Wenjun Mei
3DGS
ViT
VGen
99
1
0
03 Apr 2025
DiSRT-In-Bed: Diffusion-Based Sim-to-Real Transfer Framework for In-Bed Human Mesh Recovery
Jing Gao
Ce Zheng
László A. Jeni
Zackory Erickson
3DH
37
0
0
03 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
33
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Chuning Zhu
Raymond Yu
S. Feng
Benjamin Burchfiel
Paarth Shah
Abhishek Gupta
VGen
55
0
0
03 Apr 2025
X-Capture: An Open-Source Portable Device for Multi-Sensory Learning
Samuel Clarke
Suzannah Wistreich
Yanjie Ze
Jiajun Wu
31
0
0
03 Apr 2025
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
J. Wang
Jingyuan Liu
Xin Sun
Krishna Kumar Singh
Zhixin Shu
...
Nanxuan Zhao
Tuanfeng Y. Wang
Simon Chen
Ulrich Neumann
Jae Shin Yoon
27
0
0
03 Apr 2025
F-ViTA: Foundation Model Guided Visible to Thermal Translation
Jay N. Paranjape
C. D. Melo
Vishal M. Patel
VGen
39
0
0
03 Apr 2025
Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization
Kangle Deng
Hsueh-Ti Derek Liu
Yiheng Zhu
Xiaoxia Sun
Chong Shang
Kiran Bhat
Deva Ramanan
Jun-Yan Zhu
Maneesh Agrawala
Tinghui Zhou
75
0
0
03 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM
3DGS
VGen
70
0
0
03 Apr 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
Xianwei Zhuang
Yuxin Xie
Yufan Deng
Dongchao Yang
Liming Liang
Jinghan Ru
Yuguo Yin
Yuexian Zou
68
1
0
03 Apr 2025
OmniCam: Unified Multimodal Video Generation via Camera Control
Xiaoda Yang
Jiayang Xu
Kaixuan Luan
Xinyu Zhan
Hongshun Qiu
...
Shuai Yang
Li Zhang
Checheng Yu
Cewu Lu
Lixin Yang
DiffM
VGen
62
0
0
03 Apr 2025
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
Hancheng Min
Chris Callison-Burch
René Vidal
DiffM
KELM
72
0
0
03 Apr 2025
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao
Peiyuan Zhang
Kexian Tang
Hao Li
Zicheng Zhang
Guangtao Zhai
Junchi Yan
Hua Yang
Xue Yang
Haodong Duan
VLM
LRM
41
0
0
03 Apr 2025
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
Ahmet Burak Yildirim
Mustafa Utku Aydogdu
Duygu Ceylan
Aysegül Dündar
DiffM
45
1
0
03 Apr 2025
Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions
Jinyoung Choi
Junoh Kang
Bohyung Han
35
0
0
02 Apr 2025
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim
S. Park
Geonhee Han
Seung Wook Kim
Paul Hongsuck Seo
DiffM
47
0
0
02 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
30
0
0
02 Apr 2025
3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting
Hao Wu
Hao Wang
Ruochong Li
Xuran Ma
Hui Xiong
37
0
0
02 Apr 2025
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Jincheng Zhong
Xiangcheng Zhang
J. Z. Wang
Mingsheng Long
35
1
0
02 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
29
0
0
02 Apr 2025
A$^\text{T}$A: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting
Yizhe Tang
Zhimin Sun
Yuzhen Du
Ran Yi
Guangben Lu
T. Hu
Luying Li
Lizhuang Ma
Fangyuan Zou
DiffM
35
0
0
02 Apr 2025
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Yuxuan Luo
Zhengkun Rong
Lizhen Wang
Longhao Zhang
Tianshu Hu
Yongming Zhu
VGen
95
0
0
02 Apr 2025
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
Runhui Huang
Chunwei Wang
Junwei Yang
Guansong Lu
Yunlong Yuan
...
Lu Hou
Wei Zhang
Lanqing Hong
Hengshuang Zhao
Hang Xu
MLLM
81
1
0
02 Apr 2025
All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning
Zheng Yang
Ruoxin Chen
Zhiyuan Yan
Ke-Yue Zhang
Xinghe Fu
...
Xiujun Shu
Taiping Yao
Junchi Yan
Shouhong Ding
Xi Li
29
0
0
02 Apr 2025