Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 1,010 papers shown

Character-Aware Models Improve Visual Text Rendering

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Sharan Narang

262

20 Dec 2022

Benchmarking Spatial Relationships in Text-to-Image Generation

Yezhou Yang

361

20 Dec 2022

Scalable Diffusion Models with Transformers

IEEE International Conference on Computer Vision (ICCV), 2022

William S. Peebles

Saining Xie

GNN

2.3K

4,295

19 Dec 2022

Point-E: A System for Generating 3D Point Clouds from Complex Prompts

353

769

16 Dec 2022

CLIPPO: Image-and-Language Understanding from Pixels Only

Computer Vision and Pattern Recognition (CVPR), 2022

340

15 Dec 2022

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Computer Vision and Pattern Recognition (CVPR), 2022

...

227

252

13 Dec 2022

Elixir: Train a Large Language Model on a Small GPU Cluster

Yang You

250

10 Dec 2022

MAGVIT: Masked Generative Video Transformer

Computer Vision and Pattern Recognition (CVPR), 2022

...

Alexander G. Hauptmann

Ming-Hsuan Yang

294

333

10 Dec 2022

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

International Conference on Learning Representations (ICLR), 2022

587

382

09 Dec 2022

Multi-Concept Customization of Text-to-Image Diffusion

Computer Vision and Pattern Recognition (CVPR), 2022

Jun-Yan Zhu

731

1,168

08 Dec 2022

Diffusion Guided Domain Adaptation of Image Generators

264

08 Dec 2022

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

458

180

08 Dec 2022

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

International Conference on Machine Learning (ICML), 2022

Jianmin Bao

...

234

07 Dec 2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2022

Ying Shan

219

06 Dec 2022

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

International Conference on Learning Representations (ICLR), 2022

201

06 Dec 2022

M-VADER: A Model for Diffusion with Multimodal Context

Andres Felipe Cruz Salinas

DiffM

337

06 Dec 2022

3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models

264

01 Dec 2022

CLIPascene: Scene Sketching with Different Types and Levels of Abstraction

IEEE International Conference on Computer Vision (ICCV), 2022

Daniel Cohen-Or

248

30 Nov 2022

Fast Inference from Transformers via Speculative Decoding

International Conference on Machine Learning (ICML), 2022

Yaniv Leviathan

Matan Kalman

Yossi Matias

LRM

634

1,151

30 Nov 2022

High-Fidelity Guided Image Synthesis with Latent Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

Jaskirat Singh

Stephen Gould

Liang Zheng

DiffM

195

30 Nov 2022

Continuous diffusion for categorical data

...

334

144

28 Nov 2022

Unified Discrete Diffusion for Simultaneous Vision-Language Generation

International Conference on Learning Representations (ICLR), 2022

Zuopeng Yang

Ponnuthurai Nagaratnam Suganthan

DiffM

273

27 Nov 2022

SpaText: Spatio-Textual Representation for Controllable Image Generation

Computer Vision and Pattern Recognition (CVPR), 2022

Devi Parikh

278

247

25 Nov 2022

3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models

517

25 Nov 2022

Shifted Diffusion for Text-to-image Generation

Computer Vision and Pattern Recognition (CVPR), 2022

295

24 Nov 2022

Paint by Example: Exemplar-based Image Editing with Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

297

546

23 Nov 2022

Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors

290

23 Nov 2022

ReCo: Region-Controlled Text-to-Image Generation

Computer Vision and Pattern Recognition (CVPR), 2022

...

Zicheng Liu

274

188

23 Nov 2022

Inversion-Based Style Transfer with Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

311

394

23 Nov 2022

Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

Computer Vision and Pattern Recognition (CVPR), 2022

412

23 Nov 2022

Retrieval-Augmented Multimodal Language Modeling

International Conference on Machine Learning (ICML), 2023

Weijia Shi

Luke Zettlemoyer

252

132

22 Nov 2022

Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark

...

117

22 Nov 2022

SceneComposer: Any-Level Semantic Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2022

149

21 Nov 2022

VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

Ajay Jain

Amber Xie

Pieter Abbeel

DiffM

214

118

21 Nov 2022

MagicVideo: Efficient Video Generation With Latent Diffusion Models

394

467

20 Nov 2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

244

20 Nov 2022

Visual Programming: Compositional visual reasoning without training

Computer Vision and Pattern Recognition (CVPR), 2022

Tanmay Gupta

Aniruddha Kembhavi

ReLM VLM LRM

436

571

18 Nov 2022

Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

200

17 Nov 2022

Will Large-scale Generative Models Corrupt Future Datasets?

IEEE International Conference on Computer Vision (ICCV), 2022

Ryuichiro Hataya

Han Bao

Hiromi Arai

239

15 Nov 2022

Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities

Siddhartha Datta

241

15 Nov 2022

A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces

140

14 Nov 2022

Large-Scale Bidirectional Training for Zero-Shot Image Captioning

220

13 Nov 2022

SSGVS: Semantic Scene Graph-to-Video Synthesis

Yuren Cong

Jinhui Yi

Bodo Rosenhahn

M. Yang

242

11 Nov 2022

Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis

IEEE International Conference on Computer Vision (ICCV), 2022

461

04 Nov 2022

Evaluating a Synthetic Image Dataset Generated with Stable Diffusion

Andreas Stöckl

212

03 Nov 2022

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

...

618

983

02 Nov 2022

MagicMix: Semantic Mixing with Diffusion Models

369

28 Oct 2022

UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance

...

504

28 Oct 2022

Deep Generative Models on 3D Representations: A Survey

319

27 Oct 2022

In-context Reinforcement Learning with Algorithm Distillation

International Conference on Learning Representations (ICLR), 2022

Stephen Spencer

...

230

167

25 Oct 2022