Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

23 May 2022

Seyed Kamyar Seyed Ghasemipour

Burcu Karagol Ayan

S. S. Mahdavi

Raphael Gontijo-Lopes

David J Fleet

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,039 papers shown

Prompt-aware classifier free guidance for diffusion models

Xuanhao Zhang

Chang Li

DiffM VLM

173

25 Sep 2025

MMG: Mutual Information Estimation via the MMSE Gap in Diffusion

217

24 Sep 2025

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

619

24 Sep 2025

Efficient Encoder-Free Pose Conditioning and Pose Control for Virtual Try-On

212

24 Sep 2025

InstructVTON: Optimal Auto-Masking and Natural-Language-Guided Interactive Style Control for Inpainting-Based Virtual Try-On

156

24 Sep 2025

OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment

Teng Xiao

Zuchao Li

Lefei Zhang

178

23 Sep 2025

Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters

124

23 Sep 2025

Synthesizing Artifact Dataset for Pixel-level Detection

Dennis Menn

Feng Liang

Diana Marculescu

104

23 Sep 2025

How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective

...

319

23 Sep 2025

Training-Free Multi-Style Fusion Through Reference-Based Adaptive Modulation

134

23 Sep 2025

AGSwap: Overcoming Category Boundaries in Object Fusion via Adaptive Group Swapping

222

23 Sep 2025

Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis

23 Sep 2025

Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers

184

22 Sep 2025

Audio Super-Resolution with Latent Bridge Models

331

22 Sep 2025

MEF: A Systematic Evaluation Framework for Text-to-Image Models

158

22 Sep 2025

ComposeMe: Attribute-Specific Image Prompts for Controllable Human Image Generation

223

22 Sep 2025

Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology

294

22 Sep 2025

Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration

252

22 Sep 2025

Single-Image Depth from Defocus with Coded Aperture and Diffusion Posterior Sampling

Hodaka Kawachi

Jose Reinaldo Cunha Santos A. V. Silva Neto

124

22 Sep 2025

Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding

208

22 Sep 2025

Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models

128

21 Sep 2025

Stencil: Subject-Driven Generation with Context Guidance

International Conference on Information Photonics (ICIP), 2025

130

21 Sep 2025

PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion

145

21 Sep 2025

Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation

...

267

20 Sep 2025

InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention

296

20 Sep 2025

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

...

204

19 Sep 2025

PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors

184

19 Sep 2025

Lynx: Towards High-Fidelity Personalized Video Generation

208

19 Sep 2025

CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models

124

19 Sep 2025

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

240

19 Sep 2025

The Iconicity of the Generated Image

Nanne van Noord

Noa Garcia

152

19 Sep 2025

Diffusion-Based Cross-Modal Feature Extraction for Multi-Label Classification

Tian Lan

Yiming Zheng

Jianxin Yin

152

19 Sep 2025

Causal Reasoning Elicits Controllable 3D Scene Generation

110

18 Sep 2025

AutoEdit: Automatic Hyperparameter Tuning for Image Editing

189

18 Sep 2025

LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition

156

18 Sep 2025

Geometric Image Synchronization with Deep Watermarking

Sylvestre-Alvise Rebuffi

321

18 Sep 2025

Noise-Level Diffusion Guidance: Well Begun is Half Done

162

17 Sep 2025

BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching

301

17 Sep 2025

Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification

286

17 Sep 2025

BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation

Rajatsubhra Chakraborty

104

16 Sep 2025

Adaptive Sampling Scheduler

16 Sep 2025

Double Helix Diffusion for Cross-Domain Anomaly Image Generation

168

16 Sep 2025

MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data

208

16 Sep 2025

SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching

214

15 Sep 2025

Flow Straight and Fast in Hilbert Space: Functional Rectified Flow

Jianxin Zhang

Clayton Scott

140

12 Sep 2025

A Discrepancy-Based Perspective on Dataset Condensation

Tong Chen

Raghavendra Selvan

261

12 Sep 2025

Compute Only 16 Tokens in One Timestep: Accelerating Diffusion Transformers with Cluster-Driven Feature Caching

144

12 Sep 2025

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

122

12 Sep 2025

Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration

246

12 Sep 2025

MagicMirror: A Large-Scale Dataset and Benchmark for Fine-Grained Artifacts Assessment in Text-to-Image Generation

188

12 Sep 2025