Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 5,041 papers shown
DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction
IEEE International Conference on Computer Vision (ICCV), 2022
Jiaming Liu
Rushil Anirudh
Jayaraman J. Thiagarajan
Stewart He
K. A. Mohan
Ulugbek S. Kamilov
Hyojin Kim
MedIm, DiffM
178
100
0
22 Nov 2022
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Computer Vision and Pattern Recognition (CVPR), 2022
Wenxuan Zhang
Xiaodong Cun
Xuan Wang
Yong Zhang
Xiaodong Shen
Yu-Xiao Guo
Ying Shan
Haiwei Yang
VGen
246
408
0
22 Nov 2022
Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
Vitali Petsiuk
Alexander E. Siemenn
Saisamrit Surbehera
Zad Chin
Keith Tyser
...
Ori Kerret
Tonio Buonassisi
Kate Saenko
Armando Solar-Lezama
Iddo Drori
VLM
117
46
0
22 Nov 2022
SinFusion: Training Diffusion Models on a Single Image or Video
International Conference on Machine Learning (ICML), 2022
Yaniv Nikankin
Niv Haim
Michal Irani
VGen
377
78
0
21 Nov 2022
SceneComposer: Any-Level Semantic Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Yu Zeng
Zhe Lin
Jianming Zhang
Qing Liu
John Collomosse
Jason Kuen
Vishal M. Patel
DiffM
150
55
0
21 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM, VLM
270
32
0
21 Nov 2022
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Ajay Jain
Amber Xie
Pieter Abbeel
DiffM
214
119
0
21 Nov 2022
Video Background Music Generation: Dataset, Method and Evaluation
IEEE International Conference on Computer Vision (ICCV), 2022
Le Zhuo
Zhaokai Wang
Baisen Wang
Yue Liao
Chenxi Bao
Stanley Peng
Miao Lu
Xiaobo Li
Fei Fang
Si Liu
VGen
281
46
0
21 Nov 2022
Investigating Prompt Engineering in Diffusion Models
Sam Witteveen
Martin Andrews
128
80
0
21 Nov 2022
Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training
Ling Yang
Zhilin Huang
Yang Song
Shenda Hong
Ge Li
Wentao Zhang
Tengjiao Wang
Guohao Li
Ming-Hsuan Yang
241
70
0
21 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM, VGen
402
470
0
20 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
245
79
0
20 Nov 2022
IC3D: Image-Conditioned 3D Diffusion for Shape Generation
Cristian Sbrolli
Paolo Cudrano
Matteo Frosi
Matteo Matteucci
DiffM
351
7
0
20 Nov 2022
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Nisha Huang
Yuxin Zhang
Fan Tang
Chongyang Ma
Haibin Huang
Yong Zhang
Weiming Dong
Changsheng Xu
DiffM
288
65
0
19 Nov 2022
EDGE: Editable Dance Generation From Music
Computer Vision and Pattern Recognition (CVPR), 2022
Jo-Han Tseng
Rodrigo Castellon
Chenxi Liu
354
336
0
19 Nov 2022
Magic3D: High-Resolution Text-to-3D Content Creation
Computer Vision and Pattern Recognition (CVPR), 2022
Chen-Hsuan Lin
Jun Gao
Luming Tang
Towaki Takikawa
Fangyin Wei
Xun Huang
Karsten Kreis
Sanja Fidler
Ming-Yuan Liu
Nayeon Lee
393
1,446
0
18 Nov 2022
Invariant Learning via Diffusion Dreamed Distribution Shifts
Priyatham Kattakinda
Alexander Levine
Soheil Feizi
DiffM
141
11
0
18 Nov 2022
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Titas Anciukevicius
Zexiang Xu
Matthew Fisher
Paul Henderson
Hakan Bilen
Niloy J. Mitra
Paul Guerrero
282
203
0
17 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Computer Vision and Pattern Recognition (CVPR), 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
879
2,543
0
17 Nov 2022
Conffusion: Confidence Intervals for Diffusion Models
Eliahu Horwitz
Yedid Hoshen
DiffM
209
33
0
17 Nov 2022
Null-text Inversion for Editing Real Images using Guided Diffusion Models
Ron Mokady
Amir Hertz
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
274
957
0
17 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
IEEE International Conference on Computer Vision (ICCV), 2022
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
405
650
0
17 Nov 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
ACM Transactions on Graphics (TOG), 2022
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM, VGen
289
234
0
17 Nov 2022
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models
Ninareh Mehrabi
Palash Goyal
Apurv Verma
Jwala Dhamala
Varun Kumar
Qian Hu
Kai-Wei Chang
R. Zemel
Aram Galstyan
Rahul Gupta
204
8
0
17 Nov 2022
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
British Machine Vision Conference (BMVC), 2022
Vaclav Kosar
A. Hoskovec
Milan Šulc
Radek Bartyzal
VLM
169
6
0
17 Nov 2022
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks
International Conference on Learning Representations (ICLR), 2022
Tommaso Salvatori
Yuhang Song
Yordan Yordanov
Beren Millidge
Zheng R. Xu
Lei Sha
Cornelius Emde
Rafal Bogacz
Thomas Lukasiewicz
332
18
0
16 Nov 2022
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
IEEE International Conference on Computer Vision (ICCV), 2022
Xingqian Xu
Zinan Lin
Eric Zhang
Kai Wang
Humphrey Shi
DiffM
562
248
0
15 Nov 2022
Will Large-scale Generative Models Corrupt Future Datasets?
IEEE International Conference on Computer Vision (ICCV), 2022
Ryuichiro Hataya
Han Bao
Hiromi Arai
245
71
0
15 Nov 2022
Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities
Siddhartha Datta
244
2
0
15 Nov 2022
Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models
Adham Elarabawy
Harish Kamath
Samuel Denton
DiffM
132
21
0
15 Nov 2022
Extreme Generative Image Compression by Learning Text Embedding from Diffusion Models
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
204
30
0
14 Nov 2022
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
187
15
0
14 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Computer Vision and Pattern Recognition (CVPR), 2022
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM, CLIP
621
907
0
14 Nov 2022
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Computer Vision and Pattern Recognition (CVPR), 2022
G. Metzer
Elad Richardson
Or Patashnik
Raja Giryes
Daniel Cohen-Or
DiffM
529
562
0
14 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM, MedIm
322
4
0
14 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
142
14
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
220
5
0
13 Nov 2022
Design of Unmanned Air Vehicles Using Transformer Surrogate Models
Adam D. Cobb
Anirban Roy
Daniel Elenius
Susmit Jha
AI4CE
176
2
0
11 Nov 2022
Efficient HLA imputation from sequential SNPs data by Transformer
Journal of Human Genetics (J Hum Genet), 2022
Kaho Tanaka
Kosuke Kato
Naoki Nonaka
J. Seita
BDL
130
8
0
11 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
247
8
0
11 Nov 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
511
454
0
09 Nov 2022
DiffPhase: Generative Diffusion-based STFT Phase Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tal Peer
Simon Welker
Timo Gerkmann
DiffM
180
14
0
08 Nov 2022
Self-conditioned Embedding Diffusion for Text Generation
Robin Strudel
Corentin Tallec
Florent Altché
Yilun Du
Yaroslav Ganin
...
Will Grathwohl
Nikolay Savinov
Sander Dieleman
Laurent Sifre
Rémi Leblond
DiffM
239
107
0
08 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Royal Society Open Science (RSOS), 2022
Michael J. Smith
James E. Geach
213
49
0
07 Nov 2022
Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
Firas Khader
Gustav Mueller-Franzes
Soroosh Tayebi Arasteh
T. Han
Christoph Haarburger
...
Johannes Stegmaier
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
DiffM, MedIm
486
81
0
07 Nov 2022
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
IEEE International Conference on Computer Vision (ICCV), 2022
Lukas Struppek
Dominik Hintersdorf
Kristian Kersting
SILM
467
54
0
04 Nov 2022
Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Andreas Stöckl
219
27
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM, MoE
670
990
0
02 Nov 2022
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
Machine Intelligence Research (MIR), 2022
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
838
837
0
02 Nov 2022
MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model
International Conference on Medical Imaging with Deep Learning (MIDL), 2022
Junde Wu
Rao Fu
Huihui Fang
Yu Zhang
Yehui Yang
Haoyi Xiong
Huiying Liu
Yanwu Xu
MedIm, VLM, DiffM
504
386
0
01 Nov 2022