Hierarchical Text-Conditional Image Generation with CLIP Latents


13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,735 papers shown
A Survey on Generative Diffusion Model
Hanqun Cao
Cheng Tan
Zhangyang Gao
Yilun Xu
Guangyong Chen
Pheng-Ann Heng
Stan Z. Li
MedIm
37
195
0
06 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
215
1,277
0
02 Sep 2022
Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets
Kristofer Schlachter
Benjamin Ahlbrand
Zhu Wang
V. Ortenzi
Ken Perlin
DiffM
3DV
6
7
0
01 Sep 2022
FLAME: Free-form Language-based Motion Synthesis & Editing
Jihoon Kim
Jiseob Kim
Sungjoon Choi
VGen
17
194
0
01 Sep 2022
Large-Scale Auto-Regressive Modeling Of Street Networks
Michael Birsak
Tom Kelly
W. Para
Peter Wonka
GNN
AI4TS
9
5
0
01 Sep 2022
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
Mingyuan Zhang
Zhongang Cai
Liang Pan
Fangzhou Hong
Xinying Guo
Lei Yang
Ziwei Liu
DiffM
VGen
18
534
0
31 Aug 2022
Let us Build Bridges: Understanding and Extending Diffusion Generative Models
Xingchao Liu
Lemeng Wu
Mao Ye
Qiang Liu
DiffM
12
78
0
31 Aug 2022
Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models
Yong Zhong
Hongtao Liu
Xiaodong Liu
Fan Bao
Weiran Shen
Chongxuan Li
AI4CE
11
4
0
30 Aug 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
13
90
0
29 Aug 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment
Mustafa Shukor
Guillaume Couairon
Matthieu Cord
VLM
CLIP
19
26
0
29 Aug 2022
LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems
Bjorn Deiseroth
P. Schramowski
Hikaru Shindo
D. Dhami
Kristian Kersting
EGVM
DiffM
8
1
0
29 Aug 2022
Grounded Affordance from Exocentric View
Hongcheng Luo
Wei Zhai
Jing Zhang
Yang Cao
Dacheng Tao
13
17
0
28 Aug 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
12
2,688
0
25 Aug 2022
Understanding Diffusion Models: A Unified Perspective
Calvin Luo
DiffM
9
332
0
25 Aug 2022
Comprehensive Dataset of Face Manipulations for Development and Evaluation of Forensic Tools
Brian DeCann
K. Trapeznikov
CVBM
13
2
0
24 Aug 2022
PromptFL: Let Federated Participants Cooperatively Learn Prompts Instead of Models -- Federated Learning in Age of Foundation Model
Tao Guo
Song Guo
Junxiao Wang
Wenchao Xu
FedML
VLM
LRM
8
110
0
24 Aug 2022
Bidirectional Contrastive Split Learning for Visual Question Answering
Yuwei Sun
H. Ochiai
16
2
0
24 Aug 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
How good are deep models in understanding the generated images?
Ali Borji
OOD
11
6
0
23 Aug 2022
Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks
Tianwei Chen
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Hajime Nagahara
VLM
22
0
0
23 Aug 2022
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Arpit Bansal
Eitan Borgnia
Hong-Min Chu
Jie S. Li
Hamid Kazemi
Furong Huang
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
DiffM
24
262
0
19 Aug 2022
Text to Image Generation: Leaving no Language Behind
Pedro Reviriego
Elena Merino-Gómez
VLM
6
13
0
19 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
13
0
0
18 Aug 2022
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning
Olivia Wiles
Isabela Albuquerque
Sven Gowal
VLM
30
44
0
18 Aug 2022
Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance
Bahjat Kawar
Roy Ganz
Michael Elad
DiffM
13
38
0
18 Aug 2022
Multimodal foundation models are better simulators of the human brain
Haoyu Lu
Qiongyi Zhou
Nanyi Fei
Zhiwu Lu
Mingyu Ding
...
Changde Du
Xin Zhao
Haoran Sun
Huiguang He
J. Wen
AI4CE
18
13
0
17 Aug 2022
ILLUME: Rationalizing Vision-Language Models through Human Interactions
Manuel Brack
P. Schramowski
Bjorn Deiseroth
Kristian Kersting
VLM
MLLM
6
3
0
17 Aug 2022
Applying Regularized Schrödinger-Bridge-Based Stochastic Process in Generative Modeling
Ki-Ung Song
DiffM
17
7
0
15 Aug 2022
Recognition of All Categories of Entities by AI
Hiroshi Yamakawa
Yutaka Matsuo
12
0
0
13 Aug 2022
Layout-Bridging Text-to-Image Synthesis
Jiadong Liang
Wenjie Pei
Feng Lu
EGVM
10
15
0
12 Aug 2022
Language-Guided Face Animation by Recurrent StyleGAN-based Generator
Tiankai Hang
Huan Yang
Bei Liu
Jianlong Fu
Xin Geng
B. Guo
VGen
13
13
0
11 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP
VLM
25
97
0
10 Aug 2022
Txt2Img-MHN: Remote Sensing Image Generation from Text Using Modern Hopfield Networks
Yonghao Xu
Weikang Yu
Pedram Ghamisi
Michael K Kopp
Sepp Hochreiter
19
31
0
08 Aug 2022
CLIP-based Neural Neighbor Style Transfer for 3D Assets
Shailesh Mishra
Jonathan Granskog
CLIP
3DH
3DPC
4
7
0
08 Aug 2022
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
Ting-Li Chen
Ruixiang Zhang
Geoffrey E. Hinton
DiffM
27
209
0
08 Aug 2022
Sampling Based On Natural Image Statistics Improves Local Surrogate Explainers
Ricardo Kleinlein
Alexander Hepburn
Raúl Santos-Rodríguez
Fernando Fernández-Martínez
AAML
FAtt
11
2
0
08 Aug 2022
Creative Wand: A System to Study Effects of Communications in Co-Creative Settings
Zhiyu Lin
Rohan Agarwal
Mark O. Riedl
12
7
0
04 Aug 2022
Adversarial Attacks on Image Generation With Made-Up Words
Raphael Milliere
16
38
0
04 Aug 2022
Masked Vision and Language Modeling for Multi-modal Representation Learning
Gukyeong Kwon
Zhaowei Cai
Avinash Ravichandran
Erhan Bas
Rahul Bhotika
Stefano Soatto
22
66
0
03 Aug 2022
Pyramidal Denoising Diffusion Probabilistic Models
Dohoon Ryu
Jong Chul Ye
10
25
0
03 Aug 2022
DALLE-URBAN: Capturing the urban design expertise of large text to image transformers
Sachith Seneviratne
Damith A. Senanayake
Sanka Rasnayaka
Rajith Vidanaarachchi
Jason Thompson
ViT
4
17
0
03 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
15
1,680
0
02 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Y. Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
29
1,744
0
02 Aug 2022
AI Augmented Edge and Fog Computing: Trends and Challenges
Shreshth Tuli
Fatemeh Mirhakimi
Samodha Pallewatta
Syed Zawad
G. Casale
B. Javadi
Feng Yan
Rajkumar Buyya
N. Jennings
17
56
0
01 Aug 2022
Testing Relational Understanding in Text-Guided Image Generation
C. Conwell
T. Ullman
EGVM
133
64
0
29 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
14
68
0
26 Jul 2022
What is Healthy? Generative Counterfactual Diffusion for Lesion Localization
Pedro Sanchez
Antanas Kascenas
Xiao Liu
Alison Q. O'Neil
Sotirios A. Tsaftaris
MedIm
DiffM
11
60
0
25 Jul 2022
Intention-Conditioned Long-Term Human Egocentric Action Forecasting
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
EgoV
19
28
0
25 Jul 2022
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models
Huy Ha
Shuran Song
LM&Ro
VLM
28
101
0
23 Jul 2022
Do Perceptually Aligned Gradients Imply Adversarial Robustness?
Roy Ganz
Bahjat Kawar
Michael Elad
AAML
4
8
0
22 Jul 2022