ResearchTrend.AI
Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

35 / 4,735 papers shown
Title
Diagnosing and Fixing Manifold Overfitting in Deep Generative Models
G. Loaiza-Ganem
Brendan Leigh Ross
Jesse C. Cresswell
Anthony L. Caterini
GAN
DRL
6
27
0
14 Apr 2022
Synthesizing Adversarial Visual Scenarios for Model-Based Robotic Control
Shubhankar Agarwal
Sandeep P. Chinchali
AAML
17
4
0
13 Apr 2022
Contrastive language and vision learning of general fashion concepts
P. Chia
Giuseppe Attanasio
Federico Bianchi
Silvia Terragni
A. Magalhães
Diogo Gonçalves
C. Greco
Jacopo Tagliabue
CLIP
13
42
0
08 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM
SyDa
DiffM
11
111
0
06 Apr 2022
CLIP-Mesh: Generating textured meshes from text using pretrained image-text models
N. Khalid
Tianhao Xie
Eugene Belilovsky
Tiberiu Popa
CLIP
6
291
0
24 Mar 2022
Complex Scene Image Editing by Scene Graph Comprehension
Zhongping Zhang
Huiwen He
Bryan A. Plummer
Z. Liao
Huayan Wang
DiffM
17
6
0
24 Mar 2022
How well does CLIP understand texture?
Chenyun Wu
Subhransu Maji
8
6
0
22 Mar 2022
Diffusion Probabilistic Modeling for Video Generation
Ruihan Yang
Prakhar Srivastava
Stephan Mandt
DiffM
VGen
13
255
0
16 Mar 2022
The Role of ImageNet Classes in Fréchet Inception Distance
Tuomas Kynkäänniemi
Tero Karras
M. Aittala
Timo Aila
J. Lehtinen
EGVM
VLM
8
197
0
11 Mar 2022
KPE: Keypoint Pose Encoding for Transformer-based Image Generation
Soon Yau Cheong
A. Mustafa
Andrew Gilbert
ViT
17
10
0
09 Mar 2022
Joint rotational invariance and adversarial training of a dual-stream Transformer yields state of the art Brain-Score for Area V4
William Berrios
Arturo Deza
MedIm
ViT
12
13
0
08 Mar 2022
A Typology for Exploring the Mitigation of Shortcut Behavior
Felix Friedrich
Wolfgang Stammer
P. Schramowski
Kristian Kersting
LLMAG
12
9
0
04 Mar 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao W. Wang
Wei Liu
Qian He
Xin-ru Wu
Zili Yi
CLIP
VLM
177
71
0
01 Mar 2022
One-shot Ultra-high-Resolution Generative Adversarial Network That Synthesizes 16K Images On A Single GPU
Junseok Oh
Donghwee Yoon
Injung Kim
11
1
0
28 Feb 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
Jaemin Cho
Abhaysinh Zala
Mohit Bansal
ViT
132
167
0
08 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
6
58
0
01 Feb 2022
FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control
Dimitri von Rütte
Luca Biggio
Yannic Kilcher
Thomas Hofmann
17
0
0
26 Jan 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
19
48
0
27 Dec 2021
Quasi-Taylor Samplers for Diffusion Generative Models based on Ideal Derivatives
Hideyuki Tachibana
Mocho Go
Muneyoshi Inahara
Yotaro Katayama
Yotaro Watanabe
DiffM
14
3
0
26 Dec 2021
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Andreas Fürst
Elisabeth Rumetshofer
Johannes Lehner
Viet-Hung Tran
Fei Tang
...
David P. Kreil
Michael K Kopp
G. Klambauer
Angela Bitto-Nemling
Sepp Hochreiter
VLM
CLIP
193
101
0
21 Oct 2021
Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang
Qianqian Xie
Jiahuan Pei
Zhihong Chen
Prayag Tiwari
Zhao Li
Jie Fu
LM&MA
AI4CE
20
160
0
11 Oct 2021
An Explainable-AI approach for Diagnosis of COVID-19 using MALDI-ToF Mass Spectrometry
V. Seethi
Z. LaCasse
P. Chivte
Joshua Bland
Shrihari S. Kadkol
E. Gaillard
Pratool Bharti
Hamed Alhoori
16
9
0
28 Sep 2021
How much human-like visual experience do current self-supervised learning algorithms need in order to achieve human-level object recognition?
Emin Orhan
OOD
22
4
0
23 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
182
403
0
13 Jul 2021
Systematic human learning and generalization from a brief tutorial with explanatory feedback
A. Nam
James L. McClelland
11
0
0
10 Jul 2021
Visual Probing: Cognitive Framework for Explaining Self-Supervised Image Representations
Witold Oleszkiewicz
Dominika Basaj
Igor Sieradzki
Michal Górszczak
Barbara Rychalska
K. Lewandowska
Tomasz Trzciński
Bartosz Zieliński
SSL
29
3
0
21 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM
MedIm
14
359
0
16 Jun 2021
Communicating Natural Programs to Humans and Machines
Samuel Acquaviva
Yewen Pu
Marta Kryven
Theo Sechopoulos
Catherine Wong
Gabrielle Ecanow
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
17
40
0
15 Jun 2021
Neural Monge Map estimation and its applications
JiaoJiao Fan
Shu Liu
Shaojun Ma
Haomin Zhou
Yongxin Chen
OT
17
17
0
07 Jun 2021
Creativity and Machine Learning: A Survey
Giorgio Franceschelli
Mirco Musolesi
VLM
AI4CE
19
37
0
06 Apr 2021
Structure Inducing Pre-Training
Matthew B. A. McDermott
Brendan Yap
Peter Szolovits
Marinka Zitnik
30
17
0
18 Mar 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
11
2,096
0
23 Dec 2020
RainNet: A Large-Scale Imagery Dataset and Benchmark for Spatial Precipitation Downscaling
Xuanhong Chen
Kairui Feng
Naiyuan Liu
Bingbing Ni
Yifan Lu
Zhengyan Tong
Ziang Liu
11
6
0
17 Dec 2020
Model-Based Deep Learning
Nir Shlezinger
Jay Whang
Yonina C. Eldar
A. Dimakis
6
310
0
15 Dec 2020