Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 4,309 papers shown
PICO: Reconstructing 3D People In Contact with Objects
Alpár Cseke
Shashank Tripathi
Sai Kumar Dwivedi
Arjun Lakshmipathy
Agniv Chatterjee
M. Black
Dimitrios Tzionas
3DH
35
1
0
24 Apr 2025
Beyond Labels: Zero-Shot Diabetic Foot Ulcer Wound Segmentation with Self-attention Diffusion Models and the Potential for Text-Guided Customization
Abderrachid Hamrani
Daniela Leizaola
Renato Sousa
Jose P. Ponce
Stanley Mathis
David G. Armstrong
Anuradha Godavarty
DiffM
MedIm
36
0
0
24 Apr 2025
DreamO: A Unified Framework for Image Customization
Chong Mou
Yanze Wu
Wenxu Wu
Zinan Guo
Pengze Zhang
...
Shaojin Wu
S. Zhao
Jian Andrew Zhang
Qian He
Xinglong Wu
44
0
0
23 Apr 2025
Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection
Jens Petersen
Davide Abati
A. Habibian
Auke Wiggers
ViT
3DPC
48
0
0
23 Apr 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro
LRM
29
0
0
22 Apr 2025
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
Zebin Yao
Lei Ren
Huixing Jiang
Chen Wei
Xiaojie Wang
Ruifan Li
Fangxiang Feng
DiffM
69
0
0
22 Apr 2025
Twin Co-Adaptive Dialogue for Progressive Image Generation
J. Wang
Yangfan He
Yan Zhong
Xinyuan Song
Jiayi Su
...
Miao Zhang
K. Li
Jiaqi Chen
Tianyu Shi
Xueqian Wang
26
0
0
21 Apr 2025
GIFDL: Generated Image Fluctuation Distortion Learning for Enhancing Steganographic Security
Xiangkun Wang
Kejiang Chen
Yuang Qi
Ruiheng Liu
Weiming Zhang
Nenghai Yu
28
0
0
21 Apr 2025
When Cloud Removal Meets Diffusion Model in Remote Sensing
Zhenyu Yu
Mohd Yamani Idna Idris
Pei Wang
DiffM
36
0
0
21 Apr 2025
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman
Manan Shah
R. V. Babu
22
0
0
21 Apr 2025
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Berk Tinaz
Zalan Fabian
Mahdi Soltanolkotabi
DiffM
19
0
0
21 Apr 2025
A Controllable Appearance Representation for Flexible Transfer and Editing
Santiago Jimenez-Navarro
Julia Guerrero-Viu
B. Masiá
DiffM
23
0
0
21 Apr 2025
Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision
Shilin Zhang
Zican Hu
Wenhao Wu
Xinyi Xie
Jianxiang Tang
Chunlin Chen
Daoyi Dong
Yu Cheng
Zhenhong Sun
Zhi Wang
OffRL
51
0
0
21 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
29
0
0
20 Apr 2025
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan
Wang Lin
Zhongqi Yue
Tenglong Ao
Liyu Jia
Wei Zhao
Juncheng Billy Li
Siliang Tang
Hanwang Zhang
39
1
0
20 Apr 2025
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
Fulong Ye
Miao Hua
Pengze Zhang
Xinghui Li
Qichao Sun
Songtao Zhao
Qian He
Xinglong Wu
56
0
0
20 Apr 2025
Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization
Shouwei Ruan
Zhenyu Wu
Yao Huang
Ruochen Zhang
Yitong Sun
Caixin Kang
Xingxing Wei
EGVM
35
0
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
29
0
0
19 Apr 2025
LLM-Enabled Style and Content Regularization for Personalized Text-to-Image Generation
Anran Yu
Wei Feng
Y. Zhang
Xiang Li
Lei Meng
Lei Wu
X. Meng
DiffM
22
0
0
19 Apr 2025
Towards Explainable Fake Image Detection with Multi-Modal Large Language Models
Yikun Ji
Y. Hong
Jiahui Zhan
H. Chen
Jun Lan
Huijia Zhu
Weiqiang Wang
L. Zhang
Jianfu Zhang
MLLM
LRM
43
0
0
19 Apr 2025
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey
Jindong Li
Y. Li
Yali Fu
Jiahong Liu
Yixin Liu
Menglin Yang
Irwin King
VLM
36
0
0
19 Apr 2025
Point-Driven Interactive Text and Image Layer Editing Using Diffusion Models
Zhenyu Yu
Mohd Yamani Idna Idris
Pei Wang
Yuelong Xia
DiffM
24
0
0
18 Apr 2025
Design Topological Materials by Reinforcement Fine-Tuned Generative Model
Haosheng Xu
Dongheng Qian
Zhixuan Liu
Yadong Jiang
Jing Wang
29
1
0
17 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
66
0
0
17 Apr 2025
ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior
Xiao Han
RunZe Tian
Yifei Tong
Fenggen Yu
Dingyao Liu
Yan Zhang
3DGS
31
0
0
17 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
40
1
0
17 Apr 2025
Mask Image Watermarking
Runyi Hu
Jie Zhang
Shiqian Zhao
Nils Lukas
Jiwei Li
Qing-Wu Guo
Han Qiu
Tianwei Zhang
20
0
0
17 Apr 2025
Recent Advance in 3D Object and Scene Generation: A Survey
Xiang Tang
Ruotong Li
Xiaopeng Fan
75
0
0
16 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
J. Xu
Y. Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Y. Zhang
Rui Feng
Weidi Xie
DiffM
46
0
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
71
0
0
16 Apr 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
39
0
0
16 Apr 2025
VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
Zhihang Yuan
Rui Xie
Yuzhang Shang
H. Zhang
Siyuan Wang
Shengen Yan
Guohao Dai
Yu Wang
DiffM
VGen
42
0
0
16 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Y. Wang
Huiyu Xu
Zhibo Wang
Jiacheng Du
Z. Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
47
0
0
15 Apr 2025
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
J. Liu
Zhaoxin Wang
Handing Wang
Cong Tian
Yaochu Jin
26
0
0
15 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
33
0
0
14 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
61
0
0
14 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
36
1
0
14 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Anil Kag
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
35
0
0
14 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
23
0
0
14 Apr 2025
ESCT3D: Efficient and Selectively Controllable Text-Driven 3D Content Generation with Gaussian Splatting
Huiqi Wu
Jianbo Mei
Yingjie Huang
Yining Xu
Jingjiao You
Yilong Liu
Li Yao
3DGS
27
0
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
19
0
0
14 Apr 2025
GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting
Junlin Hao
Peiheng Wang
Haoyang Wang
Xinggong Zhang
Zongming Guo
3DGS
VGen
55
0
0
14 Apr 2025
InstructEngine: Instruction-driven Text-to-Image Alignment
Xingyu Lu
Y. Hu
Y. Zhang
Kaiyu Jiang
Changyi Liu
...
Bin Wen
C. Yuan
Fan Yang
Tingting Gao
Di Zhang
34
0
0
14 Apr 2025
Scalable Motion In-betweening via Diffusion and Physics-Based Character Adaptation
Jia Qin
DiffM
VGen
36
0
0
13 Apr 2025
D$^2$iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
21
0
0
13 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
23
0
0
13 Apr 2025
MASH: Masked Anchored SpHerical Distances for 3D Shape Representation and Generation
Changhao Li
Yu Xin
Xiaowei Zhou
Ariel Shamir
Hao Zhang
Ligang Liu
R. Hu
48
0
0
12 Apr 2025
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
32
0
0
11 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
37
0
0
11 Apr 2025
ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian Splatting
J. H. Liu
Enpei Huang
Dongxing Mao
Hui Zhang
Xinyuan Song
Yongxin Ni
3DGS
48
0
0
10 Apr 2025