ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.13655
  4. Cited By
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image
  Diffusion Models with Large Language Models

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

23 May 2023
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
ArXivPDFHTML

Papers citing "LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models"

50 / 122 papers shown
Title
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
26
0
0
07 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Y. Jiang
Qingyao Xu
L. Zhang
DiffM
51
0
0
05 May 2025
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
Chenhan Jiang
Yihan Zeng
Hang Xu
Dit-Yan Yeung
44
0
0
28 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
41
0
0
18 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
33
0
0
14 Apr 2025
Relation-Rich Visual Document Generator for Visual Information Extraction
Relation-Rich Visual Document Generator for Visual Information Extraction
Zi-Han Jiang
Chien-Wei Lin
Wei-Hua Li
Hsuan-Tung Liu
Yi-Ren Yeh
Chu-Song Chen
30
0
0
14 Apr 2025
Generating Fine Details of Entity Interactions
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
32
0
0
11 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
R. He
DiffM
68
0
0
10 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
29
0
0
02 Apr 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
47
0
0
27 Mar 2025
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness
SeungJu Cha
Kwanyoung Lee
Ye-Chan Kim
Hyunwoo Oh
Dong-Jin Kim
41
0
0
20 Mar 2025
Visualizing Thought: Conceptual Diagrams Enable Robust Planning in LMMs
Visualizing Thought: Conceptual Diagrams Enable Robust Planning in LMMs
Nasim Borazjanizadeh
Roei Herzig
Eduard Oks
Trevor Darrell
Rogerio Feris
Leonid Karlinsky
LRM
48
0
0
14 Mar 2025
Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models
Sina Malakouti
Adriana Kovashka
EGVM
62
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
104
5
0
13 Mar 2025
CoSTA∗\ast∗: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
Advait Gupta
NandaKiran Velaga
Dang Nguyen
Tianyi Zhou
DiffM
59
0
0
13 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
37
0
0
09 Mar 2025
VisAgent: Narrative-Preserving Story Visualization Framework
Seungkwon Kim
GyuTae Park
Sangyeon Kim
Seung-Hun Nam
38
0
0
04 Mar 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Zhendong Wang
Jianmin Bao
Shuyang Gu
Dong Chen
Wengang Zhou
H. Li
DiffM
47
0
0
03 Mar 2025
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
Andrey Palaev
Adil Mehmood Khan
S. M. Ahsan Kazmi
DiffM
48
0
0
23 Jan 2025
Know "No'' Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
Know "No'' Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
J. Park
Jungbeom Lee
Jongyoon Song
Sangwon Yu
Dahuin Jung
Sungroh Yoon
45
0
0
19 Jan 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
62
2
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
83
10
0
06 Jan 2025
Future Research Avenues for Artificial Intelligence in Digital Gaming:
  An Exploratory Report
Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report
Markus Dablander
73
0
0
18 Dec 2024
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
Qiyao Xue
Xiangyu Yin
Boyuan Yang
Wei Gao
DiffM
VGen
75
9
0
30 Nov 2024
Relations, Negations, and Numbers: Looking for Logic in Generative
  Text-to-Image Models
Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell
Rupert Tawiah-Quashie
T. Ullman
74
2
0
26 Nov 2024
DynamicAvatars: Accurate Dynamic Facial Avatars Reconstruction and
  Precise Editing with Diffusion Models
DynamicAvatars: Accurate Dynamic Facial Avatars Reconstruction and Precise Editing with Diffusion Models
Yangyang Qian
Yuan Sun
Yu-Xiao Guo
DiffM
86
0
0
24 Nov 2024
Evaluating the Generation of Spatial Relations in Text and Image
  Generative Models
Evaluating the Generation of Spatial Relations in Text and Image Generative Models
Shang Hong Sim
Clarence Lee
A. Tan
Cheston Tan
EGVM
25
2
0
12 Nov 2024
Token Merging for Training-Free Semantic Binding in Text-to-Image
  Synthesis
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Jian Yang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
43
4
0
11 Nov 2024
GrounDiT: Grounding Diffusion Transformers via Noisy Patch
  Transplantation
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
Phillip Y. Lee
Taehoon Yoon
Minhyuk Sung
37
1
1
27 Oct 2024
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
Junwei Zhou
Xueting Li
Lu Qi
Ming Yang
DiffM
29
2
0
20 Oct 2024
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image
  Generation
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation
Bo Cheng
Yuhang Ma
Liebucha Wu
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Dawei Leng
Yuhui Yin
DiffM
14
8
0
18 Oct 2024
CtrlSynth: Controllable Image Text Synthesis for Data-Efficient
  Multimodal Learning
CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning
Qingqing Cao
Mahyar Najibi
Sachin Mehta
CLIP
DiffM
25
1
0
15 Oct 2024
MLP-SLAM: Multilayer Perceptron-Based Simultaneous Localization and
  Mapping With a Dynamic and Static Object Discriminator
MLP-SLAM: Multilayer Perceptron-Based Simultaneous Localization and Mapping With a Dynamic and Static Object Discriminator
Taozhe Li
Wei Sun
24
1
0
14 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image
  Generation
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
13
0
0
13 Oct 2024
Semantic Score Distillation Sampling for Compositional Text-to-3D
  Generation
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation
L. Yang
Zixiang Zhang
Junlin Han
Bohan Zeng
Runjia Li
Philip Torr
Wentao Zhang
26
2
0
11 Oct 2024
Boosting Few-Shot Detection with Large Language Models and
  Layout-to-Image Synthesis
Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis
Ahmed Abdullah
Nikolas Ebert
Oliver Wasenmüller
ObjD
25
1
0
09 Oct 2024
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang
Ling Yang
G. Li
Yaqi Cai
Jiake Xie
Yong Tang
Yujiu Yang
Mengdi Wang
Bin Cui
EGVM
CoGe
28
5
0
09 Oct 2024
Unsupervised Model Diagnosis
Unsupervised Model Diagnosis
Yinong Wang
Eileen Li
Jinqi Luo
Zhaoning Wang
Fernando De la Torre
AAML
14
1
0
08 Oct 2024
Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface
  Disease Diagnosis
Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis
Chun-Hsiao Yeh
Jiayun Wang
A. Graham
Andrea J. Liu
Bo Tan
Yubei Chen
Yi Ma
Meng C. Lin
18
2
0
01 Oct 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
26
17
0
24 Sep 2024
ABHINAW: A method for Automatic Evaluation of Typography within
  AI-Generated Images
ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images
Abhinaw Jagtap
Nachiket Tapas
R. G. Brajesh
EGVM
18
0
0
18 Sep 2024
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image
  Generation
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
Abdelrahman Eldesokey
Peter Wonka
DiffM
25
1
0
27 Aug 2024
Draw Like an Artist: Complex Scene Generation with Diffusion Model via
  Composition, Painting, and Retouching
Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching
Minghao Liu
Le Zhang
Yingjie Tian
Xiaochao Qu
Luoqi Liu
Ting Liu
DiffM
CoGe
24
2
0
25 Aug 2024
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language
  Models
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee
Yiran Luo
Tejas Gokhale
Yezhou Yang
Chitta Baral
LRM
22
5
0
05 Aug 2024
ReCorD: Reasoning and Correcting Diffusion for HOI Generation
ReCorD: Reasoning and Correcting Diffusion for HOI Generation
Jian-Yu Jiang-Lin
Kang-Yang Huang
Ling Lo
Yi-Ning Huang
Terence Lin
Jhih-Ciang Wu
Hong-Han Shuai
Wen-Huang Cheng
DiffM
24
5
0
25 Jul 2024
The Fabrication of Reality and Fantasy: Scene Generation with
  LLM-Assisted Prompt Interpretation
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao
Chan-Feng Hsu
Jhe-Hao Lin
Hongxia Xie
Terence Lin
Yi-Ning Huang
Hong-Han Shuai
Wen-Huang Cheng
DiffM
24
4
0
17 Jul 2024
Physics-Inspired Generative Models in Medical Imaging: A Review
Physics-Inspired Generative Models in Medical Imaging: A Review
Dennis Hein
Afshin Bozorgpour
Dorit Merhof
Ge Wang
DiffM
MedIm
AI4CE
26
0
0
15 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A
  Survey
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
37
13
0
10 Jul 2024
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and
  Editing
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing
Zhenyu Wang
Aoxue Li
Zhenguo Li
Xihui Liu
MLLM
DiffM
36
25
0
08 Jul 2024
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen
Xiangtai Li
Yining Li
Yanhong Zeng
Jianzong Wu
Xiangyu Zhao
Kai Chen
VLM
DiffM
54
3
0
28 Jun 2024
123
Next