Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

23 May 2022

Seyed Kamyar Seyed Ghasemipour

Burcu Karagol Ayan

S. S. Mahdavi

Raphael Gontijo-Lopes

David J Fleet

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,039 papers shown

NeuroSwift: A Lightweight Cross-Subject Framework for fMRI Visual Reconstruction of Complex Scenes

Shiyi Zhang

Dong Liang

Yihang Zhou

161

02 Oct 2025

Leveraging Prior Knowledge of Diffusion Model for Person Search

104

02 Oct 2025

DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing

296

02 Oct 2025

PEO: Training-Free Aesthetic Quality Enhancement in Pre-Trained Text-to-Image Diffusion Models with Prompt Embedding Optimization

Hovhannes Margaryan

Bo Wan

Tinne Tuytelaars

280

02 Oct 2025

Towards Better Optimization For Listwise Preference in Diffusion Models

338

02 Oct 2025

Toward Safer Diffusion Language Models: Discovery and Mitigation of Priming Vulnerability

Shojiro Yamabe

Jun Sakuma

AAML

124

01 Oct 2025

Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling

279

01 Oct 2025

Syntax-Guided Diffusion Language Models with User-Integrated Personalization

128

01 Oct 2025

ImageDoctor: Diagnosing Text-to-Image Generation via Grounded Image Reasoning

138

01 Oct 2025

Learn to Guide Your Diffusion Model

438

01 Oct 2025

Erased, But Not Forgotten: Erased Rectified Flow Transformers Still Remain Unsafe Under Concept Attack

184

01 Oct 2025

JEPA-T: Joint-Embedding Predictive Architecture with Text Fusion for Image Generation

...

01 Oct 2025

MetaLogic: Robustness Evaluation of Text-to-Image Models via Logically Equivalent Prompts

197

01 Oct 2025

Secure and Robust Watermarking for AI-generated Images: A Comprehensive Survey

184

30 Sep 2025

EVODiff: Entropy-aware Variance Optimized Diffusion Inference

154

30 Sep 2025

EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model

142

30 Sep 2025

Editable Noise Map Inversion: Encoding Target-image into Noise For High-Fidelity Image Manipulation

Mingyu Kang

Yong Suk Choi

DiffM

223

30 Sep 2025

VRWKV-Editor: Reducing quadratic complexity in transformer-based video editing

Abdelilah Aitrouga

Youssef Hmamouche

Amal El Fallah Seghrouchni

VGen

214

30 Sep 2025

Stitch: Training-Free Position Control in Multimodal Diffusion Transformers

151

30 Sep 2025

GaussEdit: Adaptive 3D Scene Editing with Text and Image Prompts

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025

193

30 Sep 2025

CO3: Contrasting Concepts Compose Better

Debottam Dutta

Jianchong Chen

Rajalaxmi Rajagopalan

Yu-Lin Wei

Romit Roy Choudhury

DiffM

126

30 Sep 2025

Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models

139

29 Sep 2025

U-DiT Policy: U-shaped Diffusion Transformers for Robotic Manipulation

29 Sep 2025

When Scores Learn Geometry: Rate Separations under the Manifold Hypothesis

1.4K

29 Sep 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

...

244

29 Sep 2025

Instruction Guided Multi Object Image Editing with Quantity and Layout Consistency

109

29 Sep 2025

DiffPCN: Latent Diffusion Model Based on Multi-view Depth Images for Point Cloud Completion

113

28 Sep 2025

HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation

150

28 Sep 2025

Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models

116

28 Sep 2025

DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation

122

28 Sep 2025

Diff-3DCap: Shape Captioning with Diffusion Models

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025

123

28 Sep 2025

Griffin: Generative Reference and Layout Guided Image Composition

Aryan Mikaeili

Amirhossein Alimohammadi

28 Sep 2025

No Concept Left Behind: Test-Time Optimization for Compositional Text-to-Image Generation

Mohammad Hossein Sameti

Amir M. Mansourian

Arash Marioriyad

Soheil Fadaee Oshyani

M. Rohban

M. Baghshah

27 Sep 2025

Group Critical-token Policy Optimization for Autoregressive Image Generation

147

26 Sep 2025

Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation

Abdelrahman Eldesokey

Aleksandar Cvejic

Bernard Ghanem

Peter Wonka

120

26 Sep 2025

UISim: An Interactive Image-Based UI Simulator for Dynamic Mobile Environments

26 Sep 2025

Training-Free Synthetic Data Generation with Dual IP-Adapter Guidance

107

26 Sep 2025

HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models

160

26 Sep 2025

SemanticControl: A Training-Free Approach for Handling Loosely Aligned Visual Conditions in ControlNet

26 Sep 2025

High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling

144

26 Sep 2025

LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision

Debargha Ganguly

Sumit Kumar

Ishwar B Balappanawar

Weicong Chen

Shashank Kambhatla

Srinivasan Iyengar

Shivkumar Kalyanaraman

Ponnurangam Kumaraguru

Vipin Chaudhary

VLM

183

26 Sep 2025

FailureAtlas: Mapping the Failure Landscape of T2I Models via Active Exploration

...

100

26 Sep 2025

A Unified Framework for Diffusion Model Unlearning with f-Divergence

226

25 Sep 2025

A Single Neuron Works: Precise Concept Erasure in Text-to-Image Diffusion Models

25 Sep 2025

Prompt-aware classifier free guidance for diffusion models

Xuanhao Zhang

Chang Li

DiffM VLM

173

25 Sep 2025

Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection

246

25 Sep 2025

MotionFlow: Learning Implicit Motion Flow for Complex Camera Trajectory Control in Video Generation

103

25 Sep 2025

CusEnhancer: A Zero-Shot Scene and Controllability Enhancement Method for Photo Customization via ResInversion

Maoye Ren

Praneetha Vaddamanu

Jianjin Xu

Fernando De la Torre Frade

DiffM

25 Sep 2025

FreeInsert: Personalized Object Insertion with Geometric and Style Control

105

25 Sep 2025

SlimDiff: Training-Free, Activation-Guided Hands-free Slimming of Diffusion Models

Arani Roy

Shristi Das Biswas

Kaushik Roy

136

25 Sep 2025