Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.05511
Cited By
Scaling up GANs for Text-to-Image Synthesis
9 March 2023
Minguk Kang
Jun-Yan Zhu
Richard Y. Zhang
Jaesik Park
Eli Shechtman
Sylvain Paris
Taesung Park
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scaling up GANs for Text-to-Image Synthesis"
50 / 80 papers shown
Title
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
21
0
0
12 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
PRISM: A Unified Framework for Photorealistic Reconstruction and Intrinsic Scene Modeling
Alara Dirik
Tuanfeng Y. Wang
Duygu Ceylan
Stefanos Zafeiriou
Anna Frühstück
DiffM
40
0
0
19 Apr 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
91
2
0
27 Mar 2025
Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms
Jiaming Song
Linqi Zhou
DiffM
59
0
0
10 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
40
0
0
09 Mar 2025
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
Xirui Hu
Jiahao Wang
Hao Chen
Weizhan Zhang
Benqi Wang
Y. Li
Haishun Nan
DiffM
62
0
0
09 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Ming-Yu Liu
J. Zhu
Qinsheng Zhang
DiffM
47
0
0
03 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
51
1
0
02 Mar 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
76
6
0
27 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
123
2
0
21 Feb 2025
PDA: Generalizable Detection of AI-Generated Images via Post-hoc Distribution Alignment
Li Wang
Wenyu Chen
Zheng Li
Shanqing Guo
34
0
0
15 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
J. Zhu
55
0
0
28 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
25
0
0
02 Jan 2025
Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models
Suttisak Wizadwongsa
Jinfan Zhou
Edward Li
Jeong Joon Park
3DV
55
0
0
31 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
120
8
0
19 Dec 2024
Parallelized Autoregressive Visual Generation
Y. Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
84
11
0
19 Dec 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
101
0
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
79
0
0
28 Nov 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
52
2
0
05 Nov 2024
MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis
Di Qiu
Zheng Chen
Rui Wang
Mingyuan Fan
Changqian Yu
Junshi Huan
Xiang Wen
VGen
29
6
0
28 Oct 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Y. Lu
Song Han
VLM
31
49
0
14 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Xiangtai Li
Zhen Dong
Lei Zhu
50
13
0
10 Oct 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
68
62
0
09 Oct 2024
Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer
Aref Tabatabaei
Zahra Dehghanian
M. Amirmazlaghani
DiffM
32
0
0
05 Oct 2024
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
Qianlong Xiang
Miao Zhang
Yuzhang Shang
Jianlong Wu
Yan Yan
Liqiang Nie
DiffM
55
9
0
05 Sep 2024
FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting
Liyao Jiang
Negar Hassanpour
Mohammad Salameh
Mohan Sai Singamsetti
Fengyu Sun
Wei Lu
Di Niu
DiffM
80
2
0
21 Aug 2024
Temporal Feature Matters: A Framework for Diffusion Model Quantization
Yushi Huang
Ruihao Gong
Xianglong Liu
Jing Liu
Yuhang Li
Jiwen Lu
Dacheng Tao
DiffM
MQ
49
0
0
28 Jul 2024
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
32
9
0
22 Jul 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
72
31
0
24 Jun 2024
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Alireza Ganjdanesh
Reza Shirkavand
Shangqian Gao
Heng Huang
DiffM
VLM
43
4
0
17 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
28
35
0
12 Jun 2024
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen
Yi Chen
Aniket Rege
Ramya Korlakai Vinayak
35
17
0
12 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
55
220
0
10 Jun 2024
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
Eduard Poesina
Adriana Valentina Costache
Adrian-Gabriel Chifu
Josiane Mothe
Radu Tudor Ionescu
VLM
37
1
0
07 Jun 2024
RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection
Zhiyuan He
Pin-Yu Chen
Tsung-Yi Ho
31
12
0
30 May 2024
Does Diffusion Beat GAN in Image Super Resolution?
Denis Kuznedelev
Valerii Startsev
Daniil Shlenskii
Sergey Kastryulin
28
4
0
27 May 2024
ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
F. Babiloni
Alexandros Lattas
Jiankang Deng
S. Zafeiriou
DiffM
33
4
0
26 May 2024
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Junhao Cheng
Baiqiao Yin
Kaixin Cai
Minbin Huang
Hanhui Li
...
Yue Li
Yifei Li
Yuhao Cheng
Yiqiang Yan
Xiaodan Liang
DiffM
MLLM
32
12
0
29 Apr 2024
F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation
M. M. Ho
Shikha Dubey
Yosep Chong
Beatrice S. Knudsen
Tolga Tasdizen
MedIm
AI4CE
29
1
0
19 Apr 2024
Inverse Neural Rendering for Explainable Multi-Object Tracking
Julian Ost
Tanushree Banerjee
Mario Bijelic
Felix Heide
25
0
0
18 Apr 2024
FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models
Barbara Toniella Corradini
Mustafa Shukor
Paul Couairon
Guillaume Couairon
Franco Scarselli
Matthieu Cord
DiffM
VLM
38
4
0
29 Mar 2024
What Sketch Explainability Really Means for Downstream Tasks
Hmrishav Bandyopadhyay
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Tao Xiang
Yi-Zhe Song
30
4
0
14 Mar 2024
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
Senthil Purushwalkam
Akash Gokul
Shafiq R. Joty
Nikhil Naik
DiffM
29
16
0
25 Jan 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Eric Wang
X. Li
Luisa Verdoliva
Shu Hu
75
53
0
22 Jan 2024
ZONE: Zero-Shot Instruction-Guided Local Editing
Shanglin Li
Bo-Wen Zeng
Yutang Feng
Sicheng Gao
Xuhui Liu
...
Li Lin
Xu Tang
Yao Hu
Jianzhuang Liu
Baochang Zhang
DiffM
18
30
0
28 Dec 2023
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh
Jaehwan Jeong
Sieun Kim
Wonmin Byeon
Jinkyu Kim
Sungwoong Kim
Sangpil Kim
VGen
DiffM
33
20
0
07 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
23
153
0
05 Dec 2023
SMaRt: Improving GANs with Score Matching Regularity
Mengfei Xia
Yujun Shen
Ceyuan Yang
Ran Yi
Wenping Wang
Yong-jin Liu
24
5
0
30 Nov 2023
Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing
Piper Wolters
F. Bastani
Aniruddha Kembhavi
22
2
0
29 Nov 2023
1
2
Next