ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.03869
  4. Cited By
Harnessing the Spatial-Temporal Attention of Diffusion Models for
  High-Fidelity Text-to-Image Synthesis

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis

7 April 2023
Qiucheng Wu
Yujian Liu
Handong Zhao
T. Bui
Zhe-nan Lin
Yang Zhang
Shiyu Chang
    DiffM
ArXivPDFHTML

Papers citing "Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis"

44 / 44 papers shown
Title
DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor
DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor
Wei-Ting Chen
Yu-Jiet Vong
Yi-Tsung Lee
Sy-Yen Kuo
Qiang Gao
Sizhuo Ma
Jian Wang
79
0
0
06 May 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
33
0
0
14 Apr 2025
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline
Lei Wang
Yujie Zhong
Xiaopeng Sun
Jingchun Cheng
C. Feng
Qiong Cao
Lin Ma
Zhaoxin Fan
41
0
0
01 Apr 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
60
0
0
28 Mar 2025
ToLo: A Two-Stage, Training-Free Layout-To-Image Generation Framework For High-Overlap Layouts
Linhao Huang
Jing Yu
DiffM
40
0
0
03 Mar 2025
Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance
Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance
Jin Zhu
Huimin Ma
Jiansheng Chen
Jian Yuan
71
4
0
20 Jan 2025
Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation
Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation
Lakshmikar R. Polamreddy
Kalyan Roy
Sheng-Han Yueh
Deepshikha Mahato
Shilpa Kuppili
Jialu Li
Youshan Zhang
MedIm
80
1
0
22 Nov 2024
Token Merging for Training-Free Semantic Binding in Text-to-Image
  Synthesis
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Jian Yang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
43
4
0
11 Nov 2024
CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via
  Dynamically Optimizing 3D Gaussians
CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians
Chongjian Ge
Chenfeng Xu
Yuanfeng Ji
C-T.John Peng
M. Tomizuka
Ping Luo
Mingyu Ding
Varun Jampani
W. Zhan
3DGS
32
4
0
28 Oct 2024
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
67
3
0
28 Oct 2024
ABHINAW: A method for Automatic Evaluation of Typography within
  AI-Generated Images
ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images
Abhinaw Jagtap
Nachiket Tapas
R. G. Brajesh
EGVM
18
0
0
18 Sep 2024
Optimizing Resource Consumption in Diffusion Models through
  Hallucination Early Detection
Optimizing Resource Consumption in Diffusion Models through Hallucination Early Detection
Federico Betti
Lorenzo Baraldi
Lorenzo Baraldi
Rita Cucchiara
N. Sebe
DiffM
31
0
0
16 Sep 2024
Mixed-View Panorama Synthesis using Geospatially Guided Diffusion
Mixed-View Panorama Synthesis using Geospatially Guided Diffusion
Zhexiao Xiong
Xin Xing
Scott Workman
Subash Khanal
Nathan Jacobs
DiffM
MDE
52
1
0
12 Jul 2024
Boosting Consistency in Story Visualization with Rich-Contextual
  Conditional Diffusion Models
Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models
Fei Shen
Hu Ye
Sibo Liu
Jun Zhang
Cong Wang
Xiao Han
Wei Yang
87
33
0
02 Jul 2024
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image
  Models
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models
Aishwarya Agarwal
Srikrishna Karanam
Balaji Vasan Srinivasan
29
1
0
27 Jun 2024
Compositional Video Generation as Flow Equalization
Compositional Video Generation as Flow Equalization
Xingyi Yang
Xinchao Wang
DiffM
VGen
55
7
0
10 Jun 2024
Information Theoretic Text-to-Image Alignment
Information Theoretic Text-to-Image Alignment
Chao Wang
Giulio Franzese
A. Finamore
Massimo Gallo
Pietro Michiardi
58
0
0
31 May 2024
Obtaining Favorable Layouts for Multiple Object Generation
Obtaining Favorable Layouts for Multiple Object Generation
Barak Battash
Amit Rozner
Lior Wolf
Ofir Lindenbaum
DiffM
35
2
0
01 May 2024
CLoRA: A Contrastive Approach to Compose Multiple LoRA Models
CLoRA: A Contrastive Approach to Compose Multiple LoRA Models
Tuna Han Salih Meral
Enis Simsar
Federico Tombari
Pinar Yanardag
MoMe
23
0
0
28 Mar 2024
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Oscar Manas
Pietro Astolfi
Melissa Hall
Candace Ross
Jack Urbanek
Adina Williams
Aishwarya Agrawal
Adriana Romero Soriano
M. Drozdzal
29
26
0
26 Mar 2024
XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via
  Controllable Diffusion Model
XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model
Anees Ur Rehman Hashmi
Ibrahim Almakky
Mohammad Areeb Qazi
Santosh Sanjeev
Vijay Ram Papineni
Dwarikanath Mahapatra
Mohammad Yaqub
MedIm
38
5
0
14 Mar 2024
DivCon: Divide and Conquer for Progressive Text-to-Image Generation
DivCon: Divide and Conquer for Progressive Text-to-Image Generation
Yuhao Jia
Wenhan Tan
DiffM
39
1
0
11 Mar 2024
HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry
  for Enhanced 3D Text2Shape Generation
HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation
Zhiying Leng
Tolga Birdal
Xiaohui Liang
Federico Tombari
36
3
0
01 Mar 2024
Large-scale Reinforcement Learning for Diffusion Models
Large-scale Reinforcement Learning for Diffusion Models
Yinan Zhang
Eric Tzeng
Yilun Du
Dmitry Kislyuk
VLM
26
29
0
20 Jan 2024
PALP: Prompt Aligned Personalization of Text-to-Image Models
PALP: Prompt Aligned Personalization of Text-to-Image Models
Moab Arar
Andrey Voynov
Amir Hertz
Omri Avrahami
Shlomi Fruchter
Yael Pritch
Daniel Cohen-Or
Ariel Shamir
DiffM
21
19
0
11 Jan 2024
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image
  Diffusion Models
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral
Enis Simsar
Federico Tombari
Pinar Yanardag
DiffM
VLM
22
26
0
11 Dec 2023
Correcting Diffusion Generation through Resampling
Correcting Diffusion Generation through Resampling
Yujian Liu
Yang Zhang
Tommi Jaakkola
Shiyu Chang
18
6
0
10 Dec 2023
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns
Shuliang Ning
Duomin Wang
Yipeng Qin
Zirong Jin
Baoyuan Wang
Xiaoguang Han
DiffM
25
11
0
07 Dec 2023
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
Yuwei Guo
Ceyuan Yang
Anyi Rao
Maneesh Agrawala
Dahua Lin
Bo Dai
DiffM
VGen
18
113
0
28 Nov 2023
Reason out Your Layout: Evoking the Layout Master from Large Language
  Models for Text-to-Image Synthesis
Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis
Xiaohui Chen
Yongfei Liu
Yingxiang Yang
Jianbo Yuan
Quanzeng You
Liping Liu
Hongxia Yang
DiffM
39
11
0
28 Nov 2023
A Picture is Worth a Thousand Words: Principled Recaptioning Improves
  Image Generation
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
Eyal Segalis
Dani Valevski
Danny Lumen
Yossi Matias
Yaniv Leviathan
DiffM
31
22
0
25 Oct 2023
Progressive Text-to-Image Diffusion with Soft Latent Direction
Progressive Text-to-Image Diffusion with Soft Latent Direction
Yuteng Ye
Jiale Cai
Hang Zhou
Guanwen Li
Youjia Zhang
Zikai Song
Chenxing Gao
Junqing Yu
Wei Yang
28
5
0
18 Sep 2023
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask
Yupeng Zhou
Daquan Zhou
Zuo-Liang Zhu
Yaxing Wang
Qibin Hou
Jiashi Feng
8
10
0
08 Sep 2023
Subject-Diffusion:Open Domain Personalized Text-to-Image Generation
  without Test-time Fine-tuning
Subject-Diffusion:Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
Jiancang Ma
Junhao Liang
Chen Chen
H. Lu
18
138
0
21 Jul 2023
Linguistic Binding in Diffusion Models: Enhancing Attribute
  Correspondence through Attention Map Alignment
Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Royi Rassin
Eran Hirsch
Daniel Glickman
Shauli Ravfogel
Yoav Goldberg
Gal Chechik
DiffM
20
100
0
15 Jun 2023
Grounded Text-to-Image Synthesis with Attention Refocusing
Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
18
104
0
08 Jun 2023
LayoutGPT: Compositional Visual Planning and Generation with Large
  Language Models
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng
Wanrong Zhu
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
X. Wang
William Yang Wang
MLLM
20
160
0
24 May 2023
Compositional Text-to-Image Synthesis with Attention Map Control of
  Diffusion Models
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang
Zekang Chen
Chen Chen
Jiancang Ma
H. Lu
Xiaodong Lin
DiffM
39
66
0
23 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based
  Text-to-Image Generation by Selection
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
24
20
0
22 May 2023
Discffusion: Discriminative Diffusion Models as Few-shot Vision and
  Language Learners
Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He
Weixi Feng
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
William Yang Wang
X. Wang
DiffM
32
7
0
18 May 2023
Diffusion-LM Improves Controllable Text Generation
Diffusion-LM Improves Controllable Text Generation
Xiang Lisa Li
John Thickstun
Ishaan Gulrajani
Percy Liang
Tatsunori B. Hashimoto
AI4CE
171
768
0
27 May 2022
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
F. I. F. Richard Yu
Radu Timofte
Luc Van Gool
DiffM
211
1,330
0
24 Jan 2022
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
325
1,570
0
10 Nov 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
1