Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

23 May 2022

Seyed Kamyar Seyed Ghasemipour

Burcu Karagol Ayan

S. S. Mahdavi

Raphael Gontijo-Lopes

David J Fleet

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,039 papers shown

NDM: A Noise-driven Detection and Mitigation Framework against Implicit Sexual Intentions in Text-to-Image Generation

156

17 Oct 2025

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

...

374

17 Oct 2025

Face-MakeUpV2: Facial Consistency Learning for Controllable Text-to-Image Generation

139

17 Oct 2025

Controlling the image generation process with parametric activation functions

Ilia Pavlov

GAN

227

17 Oct 2025

QSilk: Micrograin Stabilization and Adaptive Quantile Clipping for Detail-Friendly Latent Diffusion

Denis Rychkovskiy

148

17 Oct 2025

Salient Concept-Aware Generative Data Augmentation

203

16 Oct 2025

DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models

168

16 Oct 2025

Noise Projection: Closing the Prompt-Agnostic Gap Behind Text-to-Image Misalignment in Diffusion Models

108

16 Oct 2025

Consistent text-to-image generation via scene de-contextualization

124

16 Oct 2025

Adaptive Visual Conditioning for Semantic Consistency in Diffusion-Based Story Continuation

Seyed Mohammad Mousavi

Morteza Analoui

DiffM

124

15 Oct 2025

Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter

235

15 Oct 2025

End-to-End Multi-Modal Diffusion Mamba

134

15 Oct 2025

NoisePrints: Distortion-Free Watermarks for Authorship in Private Diffusion Models

247

15 Oct 2025

MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars

128

14 Oct 2025

Time-Correlated Video Bridge Matching

14 Oct 2025

Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance

104

14 Oct 2025

SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models

208

14 Oct 2025

Massive Activations are the Key to Local Detail Synthesis in Diffusion Transformers

158

13 Oct 2025

VLM-Guided Adaptive Negative Prompting for Creative Generation

144

12 Oct 2025

Local-Global Context-Aware and Structure-Preserving Image Super-Resolution

274

11 Oct 2025

Few-shot multi-token DreamBooth with LoRa for style-consistent character generation

101

10 Oct 2025

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Mohammad Hossein Sameti

Sepehr Harfi Moridani

Ali Zarean

Hossein Sameti

184

10 Oct 2025

Cross-Sensor Touch Generation

103

10 Oct 2025

GTAlign: Game-Theoretic Alignment of LLM Assistants for Social Welfare

Siqi Zhu

David Zhang

Pedro Cisneros-Velarde

J. You

LRM

204

10 Oct 2025

Reinforcing Diffusion Models by Direct Group Preference Optimization

Yihong Luo

Tianyang Hu

Jing Tang

145

09 Oct 2025

OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference

136

09 Oct 2025

InstructUDrag: Joint Text Instructions and Object Dragging for Interactive Image Editing

Haoran Yu

Yi Shi

DiffM

157

09 Oct 2025

FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching

...

103

09 Oct 2025

UniVideo: Unified Understanding, Generation, and Editing for Videos

261

09 Oct 2025

Graph Conditioned Diffusion for Controllable Histopathology Image Generation

08 Oct 2025

Toward Reliable Clinical Coding with Language Models: Verification and Lightweight Adaptation

122

08 Oct 2025

Inconsistent Affective Reaction: Sentiment of Perception and Opinion in Urban Environments

CAADRIA proceedings (CAADRIA), 2025

Jingfei Huang

Han Tu

204

08 Oct 2025

Sparse deepfake detection promotes better disentanglement

209

07 Oct 2025

Mitigating Surgical Data Imbalance with Dual-Prediction Video Diffusion Model

Danush Kumar Venkatesh

Adam Schmidt

Muhammad Abdullah Jamal

Omid Mohareri

VGen MedIm

143

07 Oct 2025

Redefining Generalization in Visual Domains: A Two-Axis Framework for Fake Image Detection with FusionDetect

239

07 Oct 2025

Teleportraits: Training-Free People Insertion into Any Scene

110

07 Oct 2025

Teamwork: Collaborative Diffusion with Low-rank Coordination and Adaptation

Sam Sartor

Pieter Peers

DiffM

160

07 Oct 2025

SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder

315

06 Oct 2025

Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation

188

06 Oct 2025

Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning

160

06 Oct 2025

Self Speculative Decoding for Diffusion Large Language Models

312

05 Oct 2025

MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator

Xuehai He

Shijie Zhou

Thivyanth Venkateswaran

163

05 Oct 2025

Let Features Decide Their Own Solvers: Hybrid Feature Caching for Diffusion Transformers

147

05 Oct 2025

Variational Diffusion Unlearning: A Variational Inference Framework for Unlearning in Diffusion Models under Data Constraints

Subhodip Panda

MS Varun

Shreyans Jain

Sarthak Kumar Maharana

Prathosh A.P.

DiffM

266

05 Oct 2025

Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models

202

04 Oct 2025

Mirage: Unveiling Hidden Artifacts in Synthetic Images with Large Vision-Language Models

120

04 Oct 2025

Paris: A Decentralized Trained Open-Weight Diffusion Model

03 Oct 2025

HAVIR: HierArchical Vision to Image Reconstruction using CLIP-Guided Versatile Diffusion

230

03 Oct 2025

PocketSR: The Super-Resolution Expert in Your Pocket Mobiles

...

302

03 Oct 2025

DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing

304

02 Oct 2025