Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

23 May 2022

Seyed Kamyar Seyed Ghasemipour

Burcu Karagol Ayan

S. S. Mahdavi

Raphael Gontijo-Lopes

David J Fleet

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,056 papers shown

Tailored Emotional LLM-Supporter: Enhancing Cultural Sensitivity

184

11 Aug 2025

Learning User Preferences for Image Generation Model

108

11 Aug 2025

Undress to Redress: A Training-Free Framework for Virtual Try-On

...

153

11 Aug 2025

Comparison Reveals Commonality: Customized Image Generation through Contrastive Inversion

111

11 Aug 2025

S^2VG: 3D Stereoscopic and Spatial Video Generation via Denoising Frame Matrix

173

11 Aug 2025

Efficient Approximate Posterior Sampling with Annealed Langevin Monte Carlo

Advait Parulekar

Litu Rout

Karthikeyan Shanmugam

Sanjay Shakkottai

223

11 Aug 2025

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

209

11 Aug 2025

Make Your MoVe: Make Your 3D Contents by Adapting Multi-View Diffusion Models to External Editing

113

11 Aug 2025

Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers

283

10 Aug 2025

AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning

187

09 Aug 2025

Talk2Image: A Multi-Agent System for Multi-Turn Image Generation and Editing

163

09 Aug 2025

Towards Effective Prompt Stealing Attack against Text-to-Image Diffusion Models

317

09 Aug 2025

UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation

215

07 Aug 2025

FLUX-Makeup: High-Fidelity, Identity-Consistent, and Robust Makeup Transfer via Diffusion Transformer

...

127

07 Aug 2025

LayerT2V: A Unified Multi-Layer Video Generation Framework

Lei Zhang

Xiaohong Liu

DiffM VGen

188

06 Aug 2025

Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Seungyong Lee

Jeong-gi Kwak

DiffM

338

06 Aug 2025

Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model

194

06 Aug 2025

Slice or the Whole Pie? Utility Control for AI Models

Ye Tao

AAML

131

06 Aug 2025

HPSv3: Towards Wide-Spectrum Human Preference Score

188

05 Aug 2025

Diffusion Models with Adaptive Negative Sampling Without External Resources

Alakh Desai

Nuno Vasconcelos

DiffM

217

05 Aug 2025

Veila: Panoramic LiDAR Generation from a Monocular RGB Image

...

140

05 Aug 2025

VideoGuard: Protecting Video Content from Unauthorized Editing

155

05 Aug 2025

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

216

05 Aug 2025

When Cars Have Stereotypes: Auditing Demographic Bias in Objects from Text-to-Image Models

Dasol Choi

Jihwan Lee

309

05 Aug 2025

CoEmoGen: Towards Semantically-Coherent and Scalable Emotional Image Content Generation

209

05 Aug 2025

Dream-to-Recon: Monocular 3D Reconstruction with Diffusion-Depth Distillation from Single Images

188

04 Aug 2025

Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor

Knowledge-Based Systems (KBS), 2025

405

04 Aug 2025

Practical, Generalizable and Robust Backdoor Attacks on Text-to-Image Diffusion Models

144

03 Aug 2025

DAG: Unleash the Potential of Diffusion Model for Open-Vocabulary 3D Affordance Grounding

220

03 Aug 2025

Versatile Transition Generation with Image-to-Video Diffusion

328

03 Aug 2025

Personalized Safety Alignment for Text-to-Image Diffusion Models

Xiao Zhang

Rex Ying

EGVM

277

02 Aug 2025

Dataset Condensation with Color Compensation

508

02 Aug 2025

ReCoSeg++: Extended Residual-Guided Cross-Modal Diffusion for Brain Tumor Segmentation

281

01 Aug 2025

Steering Guidance for Personalized Text-to-Image Diffusion Models

324

01 Aug 2025

Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting

221

01 Aug 2025

Controllable Pedestrian Video Editing for Multi-View Driving Scenarios via Motion Sequence

239

01 Aug 2025

Semantic and Temporal Integration in Latent Diffusion Space for High-Fidelity Video Super-Resolution

185

01 Aug 2025

UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries

253

31 Jul 2025

Training-free Geometric Image Editing on Diffusion Models

314

31 Jul 2025

LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing

281

30 Jul 2025

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

International Conference on Learning Representations (ICLR), 2025

244

30 Jul 2025

On the Reliability of Vision-Language Models Under Adversarial Frequency-Domain Perturbations

262

30 Jul 2025

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

201

30 Jul 2025

Trade-offs in Image Generation: How Do Different Dimensions Interact?

247

29 Jul 2025

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

...

218

29 Jul 2025

GuidPaint: Class-Guided Image Inpainting with Diffusion Models

304

29 Jul 2025

Compositional Video Synthesis by Temporal Object-Centric Learning

Adil Kaan Akan

Yucel Yemez

DiffM OCL

285

28 Jul 2025

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

507

28 Jul 2025

AIComposer: Any Style and Content Image Composition via Feature Integration

244

28 Jul 2025

On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey

344

28 Jul 2025