Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.14598
Cited By
Vision + Language Applications: A Survey
24 May 2023
Yutong Zhou
N. Shimada
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vision + Language Applications: A Survey"
19 / 19 papers shown
Title
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
Zhikai Li
Xuewen Liu
Dongrong Fu
Jianquan Li
Qingyi Gu
Kurt Keutzer
Zhen Dong
EGVM
VGen
DiffM
72
1
0
26 Aug 2024
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
Fan Bao
Shen Nie
Kaiwen Xue
Chongxuan Li
Shiliang Pu
Yaole Wang
Gang Yue
Yue Cao
Hang Su
Jun Zhu
DiffM
188
147
0
12 Mar 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
515
0
02 Jan 2023
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu
Xintao Wang
Weihao Cheng
Yan-Pei Cao
Ying Shan
Xiaohu Qie
Shenghua Gao
169
161
0
28 Dec 2022
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li
Xue Xu
Xinyan Xiao
Jiacheng Liu
Hu Yang
...
Zhanpeng Wang
Zhifan Feng
Qiaoqiao She
Yajuan Lyu
Hua-Hong Wu
110
29
0
28 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
93
143
0
05 Oct 2022
clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
Justin N. M. Pinkney
Chuan Li
CLIP
VLM
37
19
0
05 Oct 2022
Human Motion Diffusion Model
Guy Tevet
Sigal Raab
Brian Gordon
Yonatan Shafir
Daniel Cohen-Or
Amit H. Bermano
DiffM
VGen
177
713
0
29 Sep 2022
Creative Painting with Latent Diffusion Models
Xianchao Wu
DiffM
AI4CE
36
11
0
29 Sep 2022
Text-Free Learning of a Natural Language Interface for Pretrained Face Generators
Xiaodan Du
Raymond A. Yeh
Nicholas I. Kolkin
Eli Shechtman
Gregory Shakhnarovich
CLIP
13
1
0
08 Sep 2022
Text2Human: Text-Driven Controllable Human Image Generation
Yuming Jiang
Shuai Yang
Haonan Qiu
Wayne Wu
Chen Change Loy
Ziwei Liu
DiffM
101
45
0
31 May 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
556
0
29 May 2022
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
168
324
0
03 Mar 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao W. Wang
Wei Liu
Qian He
Xin-ru Wu
Zili Yi
CLIP
VLM
177
71
0
01 Mar 2022
Talk-to-Edit: Fine-Grained Facial Editing via Dialog
Yuming Jiang
Ziqi Huang
Xingang Pan
Chen Change Loy
Ziwei Liu
DiffM
104
125
0
09 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
TryOnGAN: Body-Aware Try-On via Layered Interpolation
Kathleen M. Lewis
Srivatsan Varadharajan
Ira Kemelmacher-Shlizerman
3DH
94
48
0
06 Jan 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,183
0
12 Dec 2018
1