2307.01952
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
4 July 2023
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
Papers citing
"SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis"
50 / 1,616 papers shown
Title
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang
Xiaoman Pan
Feng Luo
Shuang Qiu
Han Zhong
Dong Yu
Jianshu Chen
95
66
0
15 Feb 2024
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering
Abdullah Hamdi
Luke Melas-Kyriazi
Jinjie Mai
Guocheng Qian
Ruoshi Liu
Carl Vondrick
Bernard Ghanem
Andrea Vedaldi
3DGS
17
44
0
15 Feb 2024
Magic-Me: Identity-Specific Video Customized Diffusion
Ze Ma
Daquan Zhou
Chun-Hsiao Yeh
Xue-She Wang
Xiuyu Li
Huanrui Yang
Zhen Dong
Kurt Keutzer
Jiashi Feng
VGen
DiffM
32
31
0
14 Feb 2024
DoRA: Weight-Decomposed Low-Rank Adaptation
Shih-yang Liu
Chien-Yi Wang
Hongxu Yin
Pavlo Molchanov
Yu-Chiang Frank Wang
Kwang-Ting Cheng
Min-Hung Chen
22
337
0
14 Feb 2024
L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
Yutaro Yamada
Khyathi Raghavi Chandu
Yuchen Lin
Jack Hessel
Ilker Yildirim
Yejin Choi
AI4CE
23
12
0
14 Feb 2024
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
Luke Melas-Kyriazi
Iro Laina
Christian Rupprecht
Natalia Neverova
Andrea Vedaldi
Oran Gafni
Filippos Kokkinos
3DGS
16
64
0
13 Feb 2024
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Shentao Yang
Tianqi Chen
Mingyuan Zhou
EGVM
30
22
0
13 Feb 2024
Discovering Universal Semantic Triggers for Text-to-Image Synthesis
Shengfang Zhai
Weilong Wang
Jiajun Li
Yinpeng Dong
Hang Su
Qingni Shen
EGVM
31
3
0
12 Feb 2024
AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning
W. Para
Abdelrahman Eldesokey
Zhenyu Li
Pradyumna Reddy
Jiankang Deng
Peter Wonka
DiffM
30
0
0
08 Feb 2024
SPAD : Spatially Aware Multiview Diffusers
Yash Kant
Ziyi Wu
Michael Vasilkovsky
Guocheng Qian
Jian Ren
R. A. Guler
Bernard Ghanem
Sergey Tulyakov
Igor Gilitschenski
Aliaksandr Siarohin
DiffM
22
34
0
07 Feb 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
74
101
0
07 Feb 2024
Noise Map Guidance: Inversion with Spatial Context for Real Image Editing
Hansam Cho
Jonghyun Lee
Seoung Bum Kim
Tae-Hyun Oh
Yonghyun Jeong
DiffM
15
15
0
07 Feb 2024
ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation
Jirayu Burapacheep
Ishan Gaur
Agam Bhatia
Tristan Thrush
24
4
0
07 Feb 2024
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
Weiming Ren
Harry Yang
Ge Zhang
Cong Wei
Xinrun Du
Stephen W. Huang
Wenhu Chen
DiffM
VGen
76
53
0
06 Feb 2024
EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong
Shikun Liu
Xiaoyang Lyu
Marwan Taher
Xiaojuan Qi
Andrew J. Davison
DiffM
78
42
0
06 Feb 2024
An Inpainting-Infused Pipeline for Attire and Background Replacement
F. Mahlow
A. F. Zanella
William Alberto Cruz-Castaneda
Marcellus Amadeus
25
0
0
05 Feb 2024
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin
Zhicheng Sun
Kun Xu
Kun Xu
Liwei Chen
...
Yuliang Liu
Di Zhang
Yang Song
Kun Gai
Yadong Mu
VGen
47
42
0
05 Feb 2024
Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs
Najmeh Forouzandehmehr
Yijie Cao
Nikhil Thakurdesai
Ramin Giahi
Luyi Ma
Nima Farrokhsiar
Jianpeng Xu
Evren Körpeoglu
Kannan Achan
28
2
0
02 Feb 2024
AI-generated faces influence gender stereotypes and racial homogenization
Nouar Aldahoul
Talal Rahwan
Yasir Zaki
27
2
0
01 Feb 2024
Diffusion Facial Forgery Detection
Harry Cheng
Yangyang Guo
Tianyi Wang
L. Nie
Mohan S. Kankanhalli
56
16
0
29 Jan 2024
FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models
Feihong He
Gang Li
Mengyuan Zhang
Leilei Yan
Lingyu Si
Fanzhang Li
Li Shen
DiffM
23
15
0
28 Jan 2024
Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support
Xiaojun Wu
Di Zhang
Ruyi Gan
Junyu Lu
Ziwei Wu
Renliang Sun
Jiaxing Zhang
Pingjian Zhang
Yan Song
VLM
21
6
0
26 Jan 2024
pix2gestalt: Amodal Segmentation by Synthesizing Wholes
Ege Ozguroglu
Ruoshi Liu
Dídac Surís
Dian Chen
Achal Dave
P. Tokmakov
Carl Vondrick
DiffM
VLM
41
31
0
25 Jan 2024
StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models
Mohan Zhou
Yalong Bai
Qing Yang
Tiejun Zhao
24
0
0
25 Jan 2024
CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis with Multimodal Diffusion
Nisha Huang
Weiming Dong
Yuxin Zhang
Fan Tang
Ronghui Li
Chongyang Ma
Xiu Li
Tong-Yee Lee
Changsheng Xu
DiffM
27
7
0
25 Jan 2024
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Fanghua Yu
Jinjin Gu
Zheyuan Li
Jinfan Hu
Xiangtao Kong
Xintao Wang
Jingwen He
Yu Qiao
Chao Dong
25
127
0
24 Jan 2024
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion
Wei Li
Xue Xu
Jiachen Liu
Xinyan Xiao
20
5
0
24 Jan 2024
Towards Multi-domain Face Landmark Detection with Synthetic Data from Diffusion model
Yuanming Li
Gwantae Kim
Jeong-gi Kwak
B. Ku
Hanseok Ko
12
0
0
24 Jan 2024
CCA: Collaborative Competitive Agents for Image Editing
Tiankai Hang
Shuyang Gu
Dong Chen
Xin Geng
Baining Guo
20
5
0
23 Jan 2024
Benchmarking Large Multimodal Models against Common Corruptions
Jiawei Zhang
Tianyu Pang
Chao Du
Yi Ren
Bo-wen Li
Min-Bin Lin
MLLM
22
14
0
22 Jan 2024
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang
Zhaochen Yu
Chenlin Meng
Minkai Xu
Stefano Ermon
Bin Cui
CoGe
DiffM
30
114
0
22 Jan 2024
UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures
Mingyuan Zhou
Rakib Hyder
Ziwei Xuan
Guojun Qi
22
6
0
20 Jan 2024
DiffusionGPT: LLM-Driven Text-to-Image Generation System
Jie Qin
Jie Wu
Weifeng Chen
Yuxi Ren
Huixian Li
Hefeng Wu
Xuefeng Xiao
Rui Wang
S. Wen
DiffM
50
24
0
18 Jan 2024
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen
DiffM
22
34
0
17 Jan 2024
UniVG: Towards UNIfied-modal Video Generation
Ludan Ruan
Lei Tian
Chuanwei Huang
Xu Zhang
Xinyan Xiao
VGen
DiffM
23
3
0
17 Jan 2024
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Jonghyun Lee
Hansam Cho
Youngjoon Yoo
Seoung Bum Kim
Yonghyun Jeong
DiffM
15
7
0
17 Jan 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen
Yong Zhang
Xiaodong Cun
Menghan Xia
Xintao Wang
Chao-Liang Weng
Ying Shan
VGen
DiffM
115
274
0
17 Jan 2024
Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive
Yumeng Li
M. Keuper
Dan Zhang
Anna Khoreva
DiffM
35
10
0
16 Jan 2024
WAVES: Benchmarking the Robustness of Image Watermarks
Bang An
Mucong Ding
Tahseen Rabbani
Aakriti Agrawal
Yuancheng Xu
...
Sicheng Zhu
Abdirisak Mohamed
Yuxin Wen
Tom Goldstein
Furong Huang
20
40
0
16 Jan 2024
Instilling Multi-round Thinking to Text-guided Image Generation
Lidong Zeng
Zhedong Zheng
Yinwei Wei
Tat-Seng Chua
13
5
0
16 Jan 2024
HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation
Antoine Mercier
Ramin Nakhli
Mahesh Reddy
R. Yasarla
Hong Cai
Fatih Porikli
Guillaume Berger
DiffM
27
15
0
15 Jan 2024
InstantID: Zero-shot Identity-Preserving Generation in Seconds
Qixun Wang
Xu Bai
Haofan Wang
Zekui Qin
Anthony Chen
Huaxia Li
Xu Tang
Yao Hu
29
234
0
15 Jan 2024
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
Junsong Chen
Yue Wu
Simian Luo
Enze Xie
Sayak Paul
Ping Luo
Hang Zhao
Zhenguo Li
VLM
20
70
0
10 Jan 2024
Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu
Seohyun Lim
Hyunjung Shim
DiffM
MQ
27
6
0
09 Jan 2024
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
Tong Wu
Guandao Yang
Zhibing Li
Kai Zhang
Ziwei Liu
Leonidas J. Guibas
Dahua Lin
Gordon Wetzstein
EGVM
VGen
23
88
0
08 Jan 2024
Instruct-Imagen: Image Generation with Multi-modal Instruction
Hexiang Hu
Kelvin C. K. Chan
Yu-Chuan Su
Wenhu Chen
Yandong Li
...
Xue Ben
Boqing Gong
William W. Cohen
Ming-Wei Chang
Xuhui Jia
MLLM
38
42
0
03 Jan 2024
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions
David Junhao Zhang
Dongxu Li
Hung Le
Mike Zheng Shou
Caiming Xiong
Doyen Sahoo
VGen
14
23
0
03 Jan 2024
aMUSEd: An Open MUSE Reproduction
Suraj Patil
William Berman
Robin Rombach
Patrick von Platen
VLM
17
18
0
03 Jan 2024
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields
Jan-Niklas Dihlmann
Andreas Engelhardt
Hendrik P. A. Lensch
DiffM
VGen
16
4
0
03 Jan 2024
ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text
Dingkun Yan
Liang Yuan
Erwin Wu
Yuma Nishioka
I. Fujishiro
Suguru Saito
DiffM
10
5
0
02 Jan 2024