SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

4 July 2023
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach

Papers citing "SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis"

50 / 1,616 papers shown
Responsible Visual Editing
Minheng Ni
Yeli Shen
Lei Zhang
W. Zuo
DiffM
27
0
0
08 Apr 2024
DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation
Yingtao Tian
DiffM
19
0
0
08 Apr 2024
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Yuxi Ren
Jie Wu
Yanzuo Lu
Huafeng Kuang
Xin Xia
...
Shiyin Wang
Xuefeng Xiao
Yitong Wang
Min Zheng
Lean Fu
29
5
0
07 Apr 2024
ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model
Binghui Chen
Wenyu Li
Yifeng Geng
Xuansong Xie
Wangmeng Zuo
DiffM
35
3
0
07 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
75
33
0
07 Apr 2024
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
Gwanghyun Kim
Hayeon Kim
H. Seo
Dong un Kang
Se Young Chun
38
4
0
06 Apr 2024
Aligning Diffusion Models by Optimizing Human Utility
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Yusuke Kato
Kazuki Kozuka
105
27
0
06 Apr 2024
Idea-2-3D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs
Junhao Chen
Xiang Li
Xiaojun Ye
Chao Li
Zhaoxin Fan
Hao Zhao
VGen
3DV
200
4
0
05 Apr 2024
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
Michael Stephen Saxon
Fatima Jahara
Mahsa Khoshnoodi
Yujie Lu
Aditya Sharma
William Yang Wang
EGVM
28
9
0
05 Apr 2024
Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
Sang-Sub Jang
Jaehyeong Jo
Kimin Lee
Sung Ju Hwang
21
15
0
05 Apr 2024
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Dongzhi Jiang
Guanglu Song
Xiaoshi Wu
Renrui Zhang
Dazhong Shen
Zhuofan Zong
Yu Liu
Hongsheng Li
VLM
30
20
0
04 Apr 2024
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao
Ameya Prabhu
Adhiraj Ghosh
Yash Sharma
Philip H. S. Torr
Adel Bibi
Samuel Albanie
Matthias Bethge
VLM
126
44
0
04 Apr 2024
LCM-Lookahead for Encoder-based Text-to-Image Personalization
Rinon Gal
Or Lichter
Elad Richardson
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
DiffM
38
29
0
04 Apr 2024
Gen3DSR: Generalizable 3D Scene Reconstruction via Divide and Conquer from a Single View
Andreea Dogaru
M. Ozer
Bernhard Egger
3DGS
59
4
0
04 Apr 2024
LidarDM: Generative LiDAR Simulation in a Generated World
Vlas Zyrianov
Henry Che
Zhijian Liu
Shenlong Wang
VGen
30
20
0
03 Apr 2024
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
Petru-Daniel Tudosiu
Yongxin Yang
Shifeng Zhang
Fei Chen
Steven G. McDonagh
Gerasimos Lampouras
Ignacio Iacobacci
Sarah Parisot
37
10
0
03 Apr 2024
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Haofan Wang
Matteo Spinelli
Qixun Wang
Xu Bai
Zekui Qin
Anthony Chen
DiffM
44
84
0
03 Apr 2024
Faster Diffusion via Temporal Attention Decomposition
Haozhe Liu
Wentian Zhang
Jinheng Xie
Francesco Faccio
Mengmeng Xu
Tao Xiang
Mike Zheng Shou
Juan-Manuel Perez-Rua
Jürgen Schmidhuber
DiffM
67
19
0
03 Apr 2024
Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models
Jiachen Ma
Anda Cao
Zhiqing Xiao
Jie Zhang
Chaonan Ye
Junbo Zhao
16
29
0
02 Apr 2024
Upsample Guidance: Scale Up Diffusion Models without Training
Juno Hwang
Yong-Hyun Park
Junghyo Jo
32
12
0
02 Apr 2024
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Kangfu Mei
Zhengzhong Tu
M. Delbracio
Hossein Talebi
Vishal M. Patel
P. Milanfar
DiffM
50
12
0
01 Apr 2024
CosmicMan: A Text-to-Image Foundation Model for Humans
Shikai Li
Jianglin Fu
Kaiyuan Liu
Wentao Wang
Kwan-Yee Lin
Wayne Wu
DiffM
35
19
0
01 Apr 2024
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Agneet Chatterjee
Gabriela Ben-Melech Stan
Estelle Aflalo
Sayak Paul
Dhruba Ghosh
...
Ludwig Schmidt
Hanna Hajishirzi
Vasudev Lal
Chitta Baral
Yezhou Yang
EGVM
VLM
59
14
0
01 Apr 2024
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
Lirui Zhao
Yue Yang
Kaipeng Zhang
Wenqi Shao
Yuxin Zhang
Yu Qiao
Ping Luo
Rongrong Ji
LM&Ro
LLMAG
VLM
29
3
0
31 Mar 2024
U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation
You Wu
Kean Liu
Xiaoyue Mi
Fan Tang
Juan Cao
Jintao Li
DiffM
29
4
0
29 Mar 2024
FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models
Barbara Toniella Corradini
Mustafa Shukor
Paul Couairon
Guillaume Couairon
Franco Scarselli
Matthieu Cord
DiffM
VLM
42
4
0
29 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Imperceptible Protection against Style Imitation from Diffusion Models
Namhyuk Ahn
Wonhyuk Ahn
Kiyoon Yoo
Daesik Kim
Seung-Hun Nam
WIGM
AAML
DiffM
46
5
0
28 Mar 2024
TextCraftor: Your Text Encoder Can be Image Quality Controller
Yanyu Li
Xian Liu
Anil Kag
Ju Hu
Yerlan Idelbayev
Dhritiman Sagar
Yanzhi Wang
Sergey Tulyakov
Jian Ren
45
14
0
27 Mar 2024
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Yanwei Li
Yuechen Zhang
Chengyao Wang
Zhisheng Zhong
Yixin Chen
Ruihang Chu
Shaoteng Liu
Jiaya Jia
VLM
MLLM
MoE
37
211
0
27 Mar 2024
Capability-aware Prompt Reformulation Learning for Text-to-Image Generation
Jingtao Zhan
Qingyao Ai
Yiqun Liu
Jia Chen
Shaoping Ma
DiffM
47
4
0
27 Mar 2024
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
Ruoyu Zhao
Qingnan Fan
Fei Kou
Shuai Qin
Hong Gu
Wei Wu
Pengcheng Xu
Mingrui Zhu
Nannan Wang
Xinbo Gao
35
4
0
27 Mar 2024
VersaT2I: Improving Text-to-Image Models with Versatile Reward
Jianshu Guo
Wenhao Chai
Jie Deng
Hsiang-Wei Huang
Tianbo Ye
Yichen Xu
Jiawei Zhang
Jenq-Neng Hwang
Gaoang Wang
VLM
43
15
0
27 Mar 2024
Ship in Sight: Diffusion Models for Ship-Image Super Resolution
Luigi Sigillo
R. F. Gramaccioni
Alessandro Nicolosi
Danilo Comminiello
21
3
0
27 Mar 2024
AID: Attention Interpolation of Text-to-Image Diffusion
Qiyuan He
Jinghao Wang
Ziwei Liu
Angela Yao
DiffM
32
9
0
26 Mar 2024
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Oscar Manas
Pietro Astolfi
Melissa Hall
Candace Ross
Jack Urbanek
Adina Williams
Aishwarya Agrawal
Adriana Romero Soriano
M. Drozdzal
34
27
0
26 Mar 2024
GenesisTex: Adapting Image Denoising Diffusion to Texture Space
Chenjian Gao
Boyan Jiang
Xinghui Li
Yingpeng Zhang
Qian Yu
DiffM
44
8
0
26 Mar 2024
DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion
Yuanze Lin
Ronald Clark
Philip H. S. Torr
3DGS
35
10
0
25 Mar 2024
FlashFace: Human Image Personalization with High-fidelity Identity Preservation
Shilong Zhang
Lianghua Huang
Xi Chen
Yifei Zhang
Zhigang Wu
Yutong Feng
Wei Wang
Yujun Shen
Yu Liu
Ping Luo
43
17
0
25 Mar 2024
TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models
Zhongwei Zhang
Fuchen Long
Yingwei Pan
Zhaofan Qiu
Ting Yao
Yang Cao
Tao Mei
VGen
43
22
0
25 Mar 2024
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary
Or Patashnik
Kfir Aberman
Daniel Cohen-Or
DiffM
29
28
0
25 Mar 2024
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
Junshu Tang
Yanhong Zeng
Ke Fan
Xuheng Wang
Bo Dai
Kai Chen
Lizhuang Ma
18
7
0
25 Mar 2024
UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Socioeconomic Indicator Prediction
Xixuan Hao
Wei Chen
Yibo Yan
Siru Zhong
Kun Wang
Qingsong Wen
Yuxuan Liang
VLM
79
1
0
25 Mar 2024
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Yuda Song
Zehao Sun
Xuanwu Yin
VLM
38
16
0
25 Mar 2024
ROXIE: Defining a Robotic eXplanation and Interpretability Engine
Francisco J. Rodríguez-Lera
Miguel Ángel González Santamarta
Alejandro González-Cantón
Laura Fernández-Becerra
David Sobrín-Hidalgo
Ángel Manuel Guerrero Higueras
31
0
0
25 Mar 2024
An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models
Zizhao Hu
Shaochong Jia
Mohammad Rostami
38
1
0
25 Mar 2024
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
S. A. Baumann
Felix Krause
Michael Neumayr
Nick Stracke
Vincent Tao Hu
Björn Ommer
DiffM
LM&Ro
70
11
0
25 Mar 2024
latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
Christopher Wewer
Kevin Raj
Eddy Ilg
Bernt Schiele
J. E. Lenssen
3DGS
56
59
0
24 Mar 2024
Skull-to-Face: Anatomy-Guided 3D Facial Reconstruction and Editing
Yongqing Liang
Congyi Zhang
Junli Zhao
Wenping Wang
Xin Li
3DH
38
2
0
24 Mar 2024
A Unified Module for Accelerating STABLE-DIFFUSION: LCM-LORA
Ayush Thakur
Rashmi Vashisth
MoMe
27
2
0
24 Mar 2024