2204.06125
Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents
13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
Papers citing
"Hierarchical Text-Conditional Image Generation with CLIP Latents"
50 / 4,735 papers shown
Title
Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Preference Understanding
Kun Li
J. Wang
Yangfan He
Xinyuan Song
Ruoyu Wang
...
K. Li
Sida Li
Miao Zhang
Tianyu Shi
Xueqian Wang
40
0
0
25 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
65
1
0
24 Apr 2025
Text-to-Image Alignment in Denoising-Based Models through Step Selection
P. Grimal
Hervé Le Borgne
Olivier Ferret
DiffM
EGVM
48
0
0
24 Apr 2025
DreamO: A Unified Framework for Image Customization
Chong Mou
Yanze Wu
Wenxu Wu
Zinan Guo
Pengze Zhang
...
Shaojin Wu
S. Zhao
Jian Andrew Zhang
Qian He
Xinglong Wu
44
0
0
23 Apr 2025
ePBR: Extended PBR Materials in Image Synthesis
Yu Guo
Zhiqiang Lao
Xiyun Song
Yubin Zhou
Zongfang Lin
Heather Yu
24
0
0
23 Apr 2025
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
Zebin Yao
Lei Ren
Huixing Jiang
Chen Wei
Xiaojie Wang
Ruifan Li
Fangxiang Feng
DiffM
69
0
0
22 Apr 2025
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman
Manan Shah
R. V. Babu
19
0
0
21 Apr 2025
"I Know It When I See It": Mood Spaces for Connecting and Expressing Visual Concepts
Huzheng Yang
Katherine Xu
Michael D. Grossberg
Yutong Bai
Jianbo Shi
26
0
0
21 Apr 2025
Twin Co-Adaptive Dialogue for Progressive Image Generation
J. Wang
Yangfan He
Yan Zhong
Xinyuan Song
Jiayi Su
...
Miao Zhang
K. Li
Jiaqi Chen
Tianyu Shi
Xueqian Wang
19
0
0
21 Apr 2025
Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration
Junyuan Deng
Xinyi Wu
Yongxing Yang
Congchao Zhu
Song Wang
Zhenyao Wu
36
0
0
21 Apr 2025
Solving New Tasks by Adapting Internet Video Knowledge
Calvin Luo
Zilai Zeng
Yilun Du
Chen Sun
21
0
0
21 Apr 2025
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan
Wang Lin
Zhongqi Yue
Tenglong Ao
Liyu Jia
Wei Zhao
Juncheng Billy Li
Siliang Tang
Hanwang Zhang
35
1
0
20 Apr 2025
REDEditing: Relationship-Driven Precise Backdoor Poisoning on Text-to-Image Diffusion Models
Chongye Guo
Jinhu Fu
Junfeng Fang
Kun Wang
Guorui Feng
34
0
0
20 Apr 2025
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
Fulong Ye
Miao Hua
Pengze Zhang
Xinghui Li
Qichao Sun
Songtao Zhao
Qian He
Xinglong Wu
56
0
0
20 Apr 2025
Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization
Shouwei Ruan
Zhenyu Wu
Yao Huang
Ruochen Zhang
Yitong Sun
Caixin Kang
Xingxing Wei
EGVM
31
0
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
24
0
0
19 Apr 2025
Exploring Language Patterns of Prompts in Text-to-Image Generation and Their Impact on Visual Diversity
Maria-Teresa De Rosa Palmini
Eva Cetinic
31
0
0
19 Apr 2025
LLM-Enabled Style and Content Regularization for Personalized Text-to-Image Generation
Anran Yu
Wei Feng
Y. Zhang
Xiang Li
Lei Meng
Lei Wu
X. Meng
DiffM
22
0
0
19 Apr 2025
Teach Me How to Denoise: A Universal Framework for Denoising Multi-modal Recommender Systems via Guided Calibration
H. Li
Hanwen Du
Y. Li
Junchen Fu
Chunxiao Li
Ziyi Zhuang
Jiakang Li
Yongxin Ni
AI4TS
17
0
0
19 Apr 2025
Multi-modal Knowledge Graph Generation with Semantics-enriched Prompts
Yajing Xu
Zhiqiang Liu
Jiaoyan Chen
Mingchen Tu
Z. Chen
Jeff Z. Pan
Yichi Zhang
Yushan Zhu
Wen Zhang
H. Chen
29
0
0
18 Apr 2025
Point-Driven Interactive Text and Image Layer Editing Using Diffusion Models
Zhenyu Yu
Mohd Yamani Idna Idris
Pei Wang
Yuelong Xia
DiffM
19
0
0
18 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
41
0
0
18 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
40
0
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
103
0
0
17 Apr 2025
Image-Editing Specialists: An RLAIF Approach for Diffusion Models
Elior Benarous
Yilun Du
Heng Yang
17
0
0
17 Apr 2025
UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models
Guanlong Jiao
Biqing Huang
Kuan-Chieh Wang
Renjie Liao
DiffM
75
0
0
17 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
66
0
0
17 Apr 2025
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
Shin'ya Yamaguchi
Dewei Feng
Sekitoshi Kanai
Kazuki Adachi
Daiki Chijiwa
VLM
34
0
0
17 Apr 2025
ICAS: IP Adapter and ControlNet-based Attention Structure for Multi-Subject Style Transfer Optimization
Fuwei Liu
DiffM
31
0
0
17 Apr 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
39
0
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
71
0
0
16 Apr 2025
PCDiff: Proactive Control for Ownership Protection in Diffusion Models with Watermark Compatibility
Keke Gai
Ziyue Shen
J. Yu
Liehuang Zhu
Qi Wu
WIGM
45
0
0
16 Apr 2025
Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis
Songping Wang
Yueming Lyu
Shiqi Liu
Ning Li
Tong Tong
Hao Sun
Caifeng Shan
PICV
65
0
0
16 Apr 2025
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
Tianhui Song
Weixin Feng
Shuai Wang
X. Li
Tiezheng Ge
Bo Zheng
Limin Wang
MoMe
57
0
0
16 Apr 2025
Towards Forceful Robotic Foundation Models: a Literature Survey
William Xie
N. Correll
OffRL
56
0
0
16 Apr 2025
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Xinli Yue
Jianhui Sun
Junda Lu
Liangchao Yao
Fan Xia
Tianyi Wang
Fengyun Rao
Jing Lyu
Yuetang Deng
21
0
0
16 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Y. Wang
Huiyu Xu
Zhibo Wang
Jiacheng Du
Z. Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
47
0
0
15 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
X. Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
43
2
0
15 Apr 2025
Omni²: Unifying Omnidirectional Image Generation and Editing in an Omni Model
Liu Yang
Huiyu Duan
Yucheng Zhu
Xiaohong Liu
Lu Liu
Zitong Xu
Guangji Ma
Xiongkuo Min
Guangtao Zhai
P. Callet
VLM
VGen
49
0
0
15 Apr 2025
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
J. Liu
Zhaoxin Wang
Handing Wang
Cong Tian
Yaochu Jin
21
0
0
15 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
57
0
0
15 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
61
0
0
14 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
19
0
0
14 Apr 2025
An Image is Worth K Topics: A Visual Structural Topic Model with Pretrained Image Embeddings
Matías Piqueras
Alexandra Segerberg
Matteo Magnani
Måns Magnusson
Nataša Sladoje
33
0
0
14 Apr 2025
Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis
Zihao Liu
Mingwen Ou
Zunnan Xu
Jiaqi Huang
Haonan Han
Ronghui Li
X. Li
DiffM
28
0
0
14 Apr 2025
InstructEngine: Instruction-driven Text-to-Image Alignment
Xingyu Lu
Y. Hu
Y. Zhang
Kaiyu Jiang
Changyi Liu
...
Bin Wen
C. Yuan
Fan Yang
Tingting Gao
Di Zhang
31
0
0
14 Apr 2025
GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions
Jo-Ku Cheng
Zeren Zhang
Ran Chen
Jingyang Deng
Ziran Qin
Jinwen Ma
28
0
0
14 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
21
0
0
13 Apr 2025
D²iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
21
0
0
13 Apr 2025
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
Vasilii Korolkov
Andrey Yanchenko
VLM
33
0
0
13 Apr 2025