2204.06125
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,735 papers shown
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space
Yi Liu
Wengen Li
Jihong Guan
S. Kevin Zhou
Yichao Zhang
DiffM
44
1
0
31 Mar 2025
FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics
Yixuan Li
Yu Tian
Yipo Huang
Wei Lu
Shiqi Wang
Weisi Lin
Anderson de Rezende Rocha
54
0
0
31 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
38
0
0
30 Mar 2025
Object Isolated Attention for Consistent Story Visualization
Xiangyang Luo
Junhao Cheng
Yifan Xie
Xin Zhang
Tao Feng
Z. Liu
Fei Ma
Fei Richard Yu
DiffM
39
1
0
30 Mar 2025
Evaluating Compositional Scene Understanding in Multimodal Generative Models
Shuhao Fu
Andrew Jun Lee
Anna Wang
Ida Momennejad
Trevor J. Bihl
Hongjing Lu
Taylor W. Webb
CoGe
OCL
96
1
0
29 Mar 2025
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments
Luke Rowe
Roger Girgis
Anthony Gosselin
Liam Paull
C. Pal
Felix Heide
DiffM
VGen
33
1
0
28 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
29
0
0
28 Mar 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
58
0
0
28 Mar 2025
Semantix: An Energy Guided Sampler for Semantic Style Transfer
Huiang He
Minghui Hu
C. Zheng
Chaoyue Wang
Tat-Jen Cham
DiffM
36
0
0
28 Mar 2025
Harnessing uncertainty when learning through Equilibrium Propagation in neural networks
Jonathan Peters
Philippe Talatchian
32
0
0
28 Mar 2025
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee
Hyunsoo Lee
Sookwan Han
DiffM
44
0
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
67
0
0
27 Mar 2025
Data Poisoning in Deep Learning: A Survey
Pinlong Zhao
Weiyao Zhu
Pengfei Jiao
Di Gao
Ou Wu
AAML
34
0
0
27 Mar 2025
A Unified Image-Dense Annotation Generation Model for Underwater Scenes
Hongkai Lin
Dingkang Liang
Zhenghao Qi
X. Bai
DiffM
36
0
0
27 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
57
2
0
27 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
70
0
0
27 Mar 2025
AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification
Earl Ranario
Lars Lundqvist
Heesup Yun
Brian N Bailey
J. M. Earles
VLM
35
0
0
27 Mar 2025
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
Achint Soni
Meet Soni
Sirisha Rambhatla
DiffM
49
0
0
27 Mar 2025
EditCLIP: Representation Learning for Image Editing
Qian Wang
Aleksandar Cvejic
Abdelrahman Eldesokey
Peter Wonka
59
0
0
26 Mar 2025
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
46
1
0
26 Mar 2025
Contrastive Learning Guided Latent Diffusion Model for Image-to-Image Translation
Qi Si
Bo Wang
Zhao Zhang
60
0
0
26 Mar 2025
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng
Ruiliang Lyu
Xiaotao Gu
Xiao-Chang Liu
Jiazheng Xu
...
Zhuoyi Yang
Yuxiao Dong
Jie Tang
H. Wang
Minlie Huang
VGen
77
0
0
26 Mar 2025
Eyes Tell the Truth: GazeVal Highlights Shortcomings of Generative AI in Medical Imaging
David Wong
Bin Wang
Gorkem Durak
M. Tliba
Akshay S. Chaudhari
...
Eric Hart
Drew A Torigian
J. Udupa
Elizabeth A. Krupinski
Ulas Bagci
MedIm
26
0
0
26 Mar 2025
MMGen: Unified Multi-modal Image Generation and Understanding in One Go
Jiepeng Wang
Zhaoqing Wang
H. Pan
Yuan Liu
Dongdong Yu
Changhu Wang
Wenping Wang
DiffM
76
0
0
26 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
38
0
0
25 Mar 2025
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models
Fernando Julio Cendra
Kai Han
VLM
49
0
0
25 Mar 2025
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models
Yufei Cai
Hu Han
Yuxiang Wei
Shiguang Shan
Xilin Chen
DiffM
VGen
65
0
0
25 Mar 2025
IPGO: Indirect Prompt Gradient Optimization on Text-to-Image Generative Models with High Data Efficiency
Jianping Ye
Michel Wedel
Kunpeng Zhang
37
0
0
25 Mar 2025
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
Jiaqi Liao
Z. Yang
Linjie Li
Dianqi Li
Kevin Qinghong Lin
Yu-Xi Cheng
Lijuan Wang
MLLM
LRM
57
0
0
25 Mar 2025
Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models
K. Thakral
Tamar Glaser
Tal Hassner
Mayank Vatsa
Richa Singh
33
2
0
25 Mar 2025
SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation
Jingdan Kang
Haoxin Yang
Yan Cai
Huaidong Zhang
Xuemiao Xu
Yong Du
Shengfeng He
AAML
41
0
0
25 Mar 2025
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing
Ruiyi Wang
Yushuo Zheng
Zicheng Zhang
Chunyi Li
Shuaicheng Liu
Guangtao Zhai
Xiaohong Liu
DiffM
49
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Y. Lu
Sifei Liu
...
Jan Kautz
Song Han
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
49
0
0
25 Mar 2025
Quantifying the Ease of Reproducing Training Data in Unconditional Diffusion Models
Masaya Hasegawa
Koji Yasuda
37
0
0
25 Mar 2025
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren
Zicong Jiang
Tong Zhang
Søren Forchhammer
Sabine Süsstrunk
DiffM
51
0
0
24 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
51
1
0
24 Mar 2025
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
Jinho Jeong
Sangmin Han
Jinwoo Kim
Seon Joo Kim
34
0
0
24 Mar 2025
Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners
Wen Zheng Terence Ng
Jianda Chen
Yuan Xu
Tianwei Zhang
37
0
0
24 Mar 2025
DiffusedWrinkles: A Diffusion-Based Model for Data-Driven Garment Animation
R. Vidaurre
Elena Garces
Dan Casas
DiffM
AI4CE
79
1
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
56
0
0
24 Mar 2025
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment
Y. Lu
Qichao Wang
H. Cao
Xierui Wang
Xiaoyin Xu
Min Zhang
54
0
0
24 Mar 2025
Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance
Sicong Feng
Jielong Yang
Li Peng
DiffM
VGen
49
0
0
24 Mar 2025
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li
Kai Cheng
Anindita Ghosh
Uttaran Bhattacharya
Liangyan Gui
Aniket Bera
DiffM
VGen
37
0
0
23 Mar 2025
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models
Dvir Samuel
Matan Levy
N. Darshan
Gal Chechik
Rami Ben-Ari
DiffM
55
0
0
23 Mar 2025
Towards Transformer-Based Aligned Generation with Self-Coherence Guidance
Shulei Wang
Wang Lin
Hai Huang
Hanting Wang
Sihang Cai
...
Tao Jin
Jingyuan Chen
Jiacheng Sun
Jieming Zhu
Zhou Zhao
DiffM
55
2
0
22 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
39
0
0
22 Mar 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Oucheng Huang
Yuhang Ma
Zeng Zhao
Mingrui Wu
Jiayi Ji
Rongsheng Zhang
Z. Hu
Xiaoshuai Sun
Rongrong Ji
38
0
0
22 Mar 2025
DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis
Yongjin Choi
Chanhun Park
Seung Jun Baek
DiffM
46
0
0
22 Mar 2025
DermDiff: Generative Diffusion Model for Mitigating Racial Biases in Dermatology Diagnosis
Nusrat Munia
Abdullah-Al-Zubaer Imran
MedIm
34
1
0
21 Mar 2025
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi
Matteo Farina
Massimiliano Mancini
Elisa Ricci
Nicola Strisciuglio
CoGe
63
0
0
21 Mar 2025