arXiv: 2205.11487

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,040 papers shown
ProxT2I: Efficient Reward-Guided Text-to-Image Generation via Proximal Diffusion
Zhenghan Fang
Jian Zheng
Qiaozi Gao
Xiaofeng Gao
Jeremias Sulam
213
0
0
24 Nov 2025
LAA3D: A Benchmark of Detecting and Tracking Low-Altitude Aircraft in 3D Space
Hai Wu
Shuai Tang
Jiale Wang
Longkun Zou
Mingyue Guo
Rongqin Liang
Ke Chen
Yaowei Wang
143
1
0
24 Nov 2025
Zero-Shot Video Deraining with Video Diffusion Models
Tuomas Varanka
Juan Luis Gonzalez
Hyeongwoo Kim
Pablo Garrido
Xu Yao
DiffM, VGen
148
0
0
23 Nov 2025
Uni-DAD: Unified Distillation and Adaptation of Diffusion Models for Few-step Few-shot Image Generation
Yara Bahram
Melodie Desbos
Luke McCaffrey
Eric Granger
DiffM
81
0
0
23 Nov 2025
Synthetic Curriculum Reinforces Compositional Text-to-Image Generation
Shijian Wang
Runhao Fu
Siyi Zhao
Qingqin Zhan
Xingjian Wang
Jiarui Jin
Yuan Lu
Hanqian Wu
Cunjian Chen
EGVM
226
0
0
23 Nov 2025
ViMix-14M: A Curated Multi-Source Video-Text Dataset with Long-Form, High-Quality Captions and Crawl-Free Access
Timing Yang
Sucheng Ren
Alan Yuille
Feng Wang
VGen
123
0
0
23 Nov 2025
MINDiff: Mask-Integrated Negative Attention for Controlling Overfitting in Text-to-Image Personalization
Seulgi Jeong
Jaeil Kim
DiffM
136
0
0
22 Nov 2025
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
Tian Ye
Song Fei
Lei Zhu
92
0
0
22 Nov 2025
Where Culture Fades: Revealing the Cultural Gap in Text-to-Image Generation
Chuancheng Shi
Shangze Li
Shiming Guo
Simiao Xie
Wenhua Wu
...
Canran Xiao
Cong Wang
Zifeng Cheng
Fei Shen
Tat-Seng Chua
VLM
226
0
0
21 Nov 2025
Personalized Reward Modeling for Text-to-Image Generation
Jeongeun Lee
Ryang Heo
Dongha Lee
EGVM
156
0
0
21 Nov 2025
EvDiff: High Quality Video with an Event Camera
Weilun Li
Lei-huan Sun
Ruixi Gao
Qi Jiang
Yuqin Ma
Kaiwei Wang
M. Yang
Luc Van Gool
D. Paudel
DiffM, VGen
184
0
0
21 Nov 2025
Boosting Predictive Performance on Tabular Data through Data Augmentation with Latent-Space Flow-Based Diffusion
Md. Tawfique Ihsan
Md. Rakibul Hasan Rafi
Ahmed Shoyeb Raihan
Imtiaz Ahmed
Abdullahil Azeem
DiffM
147
0
0
20 Nov 2025
SVG360: Multi-View SVG Generation with Geometric and Color Consistency from a Single SVG
Mengnan Jiang
Zhaolin Sun
Christian Franke
Michele Franco Adesso
Antonio Haas
Grace Li Zhang
3DGS
216
0
0
20 Nov 2025
PairHuman: A High-Fidelity Photographic Dataset for Customized Dual-Person Generation
Information Fusion (Inf. Fusion), 2025
Ting Pan
Ye Wang
Peiguang Jing
Rui Ma
Zili Yi
Y. Liu
261
0
0
20 Nov 2025
PEPPER: Perception-Guided Perturbation for Robust Backdoor Defense in Text-to-Image Diffusion Models
Oscar Chew
Po-Yi Lu
Jayden Lin
Kuan-Hao Huang
Hsuan-Tien Lin
DiffM, AAML
168
0
0
20 Nov 2025
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation
Ziyu Guo
Renrui Zhang
Hongyu Li
M. Zhang
Xinyan Chen
Sifan Wang
Yan Feng
Peng Pei
Pheng-Ann Heng
245
4
0
20 Nov 2025
Towards Overcoming Data Scarcity in Nuclear Energy: A Study on Critical Heat Flux with Physics-consistent Conditional Diffusion Model
Farah Alsafadi
Alexandra Akins
Xu Wu
DiffM
233
0
0
20 Nov 2025
UniHOI: Unified Human-Object Interaction Understanding via Unified Token Space
Panqi Yang
Haodong Jing
Nanning Zheng
Yongqiang Ma
216
0
0
19 Nov 2025
SplitFlux: Learning to Decouple Content and Style from a Single Image
Yitong Yang
Y Samuel Wang
Changshuo Wang
Yongjun Zhang
Ziyang Chen
Shuting He
212
0
0
19 Nov 2025
Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion
Zhuo Li
Junjia Liu
Zhipeng Dong
Tao Teng
Quentin Rouxel
D. Caldwell
Fei Chen
88
0
0
18 Nov 2025
Coffee: Controllable Diffusion Fine-tuning
Ziyao Zeng
Jingcheng Ni
Ruyi Liu
Alex Wong
DiffM
173
0
0
18 Nov 2025
Uni-Inter: Unifying 3D Human Motion Synthesis Across Diverse Interaction Contexts
Sheng Liu
Yuanzhi Liang
Jiepeng Wang
Sidan Du
C. Zhang
Xuelong Li
179
0
0
17 Nov 2025
Infinite-Story: A Training-Free Consistent Text-to-Image Generation
Jihun Park
Kyoungmin Lee
Jongmin Gim
Hyeonseo Jo
Minseok Oh
Wonhyeok Choi
K. Hwang
Jaeyeul Kim
Minwoo Choi
S. Im
111
0
1
17 Nov 2025
Unlocking the Forgery Detection Potential of Vanilla MLLMs: A Novel Training-Free Pipeline
Rui Zuo
Qinyue Tong
Zhe-ming Lu
Ziqian Lu
160
0
0
17 Nov 2025
DriveLiDAR4D: Sequential and Controllable LiDAR Scene Generation for Autonomous Driving
Kaiwen Cai
Xinze Liu
Xia Zhou
Hengtong Hu
Jie Xiang
Luyao Zhang
Xueyang Zhang
Kun Zhan
Yifei Zhan
Xianpeng Lang
3DPC
291
0
0
17 Nov 2025
Free-Form Scene Editor: Enabling Multi-Round Object Manipulation like in a 3D Engine
Xincheng Shuai
Zhenyuan Qin
Henghui Ding
Dacheng Tao
DiffM
167
0
0
17 Nov 2025
HiGFA: Hierarchical Guidance for Fine-grained Data Augmentation with Diffusion Models
Zhiguang Lu
Qianqian Xu
Peisong Wen
Siran Da
Qingming Huang
DiffM
696
0
0
16 Nov 2025
GeoMVD: Geometry-Enhanced Multi-View Generation Model Based on Geometric Information Extraction
Jiaqi Wu
Yaosen Chen
Shuyuan Zhu
VGen
310
0
0
15 Nov 2025
Mixture of States: Routing Token-Level Dynamics for Multimodal Generation
Haozhe Liu
Ding Liu
Mingchen Zhuge
Zijian Zhou
Tian Xie
...
Juan-Manuel Perez-Rua
Tao Xiang
Wei Liu
Shikun Liu
Jürgen Schmidhuber
105
0
0
15 Nov 2025
Selecting Fine-Tuning Examples by Quizzing VLMs
Tenghao Ji
Eytan Adar
DiffM
120
0
0
15 Nov 2025
Fair Incentives for Early Arrival in 0-1 Cooperative Games
Yaoxin Ge
Yao Zhang
Dengji Zhao
111
0
0
14 Nov 2025
Prompt Triage: Structured Optimization Enhances Vision-Language Model Performance on Medical Imaging Benchmarks
Arnav Singhvi
Vasiliki Bikia
Asad Aali
Akshay S. Chaudhari
Roxana Daneshjou
LM&MA, VLM
282
0
0
14 Nov 2025
Laytrol: Preserving Pretrained Knowledge in Layout Control for Multimodal Diffusion Transformers
Sida Huang
Siqi Huang
Ping Luo
Hongyuan Zhang
DiffM
288
3
0
11 Nov 2025
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
Computer Vision and Pattern Recognition (CVPR), 2025
Lintong Zhang
Kang Yin
Seong-Whan Lee
FAtt
461
0
0
11 Nov 2025
Top2Ground: A Height-Aware Dual Conditioning Diffusion Model for Robust Aerial-to-Ground View Generation
Jae Joong Lee
Bedrich Benes
DiffM
136
0
0
11 Nov 2025
Beyond Randomness: Understand the Order of the Noise in Diffusion
Song Yan
Min Li
Bi Xinliang
J. Yang
Yusen Zhang
Guanye Xiong
Yunwei Lan
Tao Zhang
Wei Zhai
Zheng-jun Zha
DiffM
316
0
0
11 Nov 2025
LiteUpdate: A Lightweight Framework for Updating AI-Generated Image Detectors
Jiajie Lu
Zhenkan Fu
Na Zhao
Long Xing
Kejiang Chen
Weiming Zhang
Nenghai Yu
VLM
190
1
0
10 Nov 2025
Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions
Eyal Gutflaish
Eliran Kachlon
Hezi Zisman
Tal Hacham
Nimrod Sarid
...
Saar Huberman
Gal Davidi
Guy Bukchin
Kfir Goldberg
Ron Mokady
DiffM, VLM
213
2
0
10 Nov 2025
Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance
Kwanyoung Kim
DiffM
216
0
0
10 Nov 2025
A Two-Stage System for Layout-Controlled Image Generation using Large Language Models and Diffusion Models
Jan-Hendrik Koch
Jonas Krumme
Konrad Gadzicki
DiffM
593
0
0
10 Nov 2025
Test-Time Iterative Error Correction for Efficient Diffusion Models
Yunshan Zhong
Yanwei Qi
Yuxin Zhang
161
0
0
09 Nov 2025
MALeR: Improving Compositional Fidelity in Layout-Guided Generation
Shivank Saxena
D. Srivastava
Makarand Tapaswi
DiffM
135
0
0
08 Nov 2025
FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction
Jiang Lin
Xinyu Chen
Song Wu
Zhiqiu Zhang
Jizhi Zhang
Ye Wang
Qiang Tang
Qian Wang
Jian Yang
Zili Yi
DiffM
132
0
0
07 Nov 2025
SAD-Flower: Flow Matching for Safe, Admissible, and Dynamically Consistent Planning
T. Huang
Armin Lederer
Dai-Jie Wu
X. Dai
Sihua Zhang
Stefan Sosnowski
Shao-Hua Sun
Sandra Hirche
189
0
0
07 Nov 2025
Personalized Image Editing in Text-to-Image Diffusion Models via Collaborative Direct Preference Optimization
Connor Dunlop
Matthew Zheng
Kavana Venkatesh
Pinar Yanardag
DiffM
93
1
0
06 Nov 2025
Finetuning-Free Personalization of Text to Image Generation via Hypernetworks
Sagar Shrestha
Gopal Sharma
Luowei Zhou
Suren Kumar
DiffM
160
0
0
05 Nov 2025
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
Xinyan Cai
Shiguang Wu
Dafeng Chi
Yuzheng Zhuang
Xingyue Quan
Jianye Hao
Qiang Guan
100
0
0
03 Nov 2025
NSYNC: Negative Synthetic Image Generation for Contrastive Training to Improve Stylized Text-To-Image Translation
Serkan Ozturk
Samet Hicsonmez
Pinar Duygulu
DiffM
346
0
0
03 Nov 2025
XFlowMP: Task-Conditioned Motion Fields for Generative Robot Planning with Schrodinger Bridges
Khang Nguyen
Minh Nhat Vu
71
0
0
02 Nov 2025
Enhancing Frequency Forgery Clues for Diffusion-Generated Image Detection
D. Zhang
Tong Zhang
Shiming Ge
Sabine Süsstrunk
DiffM, AAML
185
0
0
01 Nov 2025