Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 1,010 papers shown

Lafite2: Few-shot Text-to-Image Generation

192

25 Oct 2022

Vitruvio: 3D Building Meshes via Single Perspective Sketches

265

24 Oct 2022

Instance-Aware Image Completion

191

22 Oct 2022

SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

258

21 Oct 2022

3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows

277

155

20 Oct 2022

Composing Ensembles of Pre-trained Models via Iterative Consensus

International Conference on Learning Representations (ICLR), 2022

Shuang Li

Antonio Torralba

160

20 Oct 2022

Transcending Scaling Laws with 0.1% Extra Compute

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

...

312

20 Oct 2022

OCR-VQGAN: Taming Text-within-Image Generation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

David Vazquez

266

19 Oct 2022

Optimizing Hierarchical Image VAEs for Sample Quality

Eric Luhman

Troy Luhman

DRL

173

18 Oct 2022

Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works

International Conference on Intelligent User Interfaces (IUI), 2022

Jinwook Seo

477

189

16 Oct 2022

LAION-5B: An open large-scale dataset for training next generation image-text models

Neural Information Processing Systems (NeurIPS), 2022

...

890

4,531

16 Oct 2022

DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models

Conference on Computer and Communications Security (CCS), 2022

Zheng Li

213

199

13 Oct 2022

Underspecification in Scene Description-to-Depiction Tasks

Ben Hutchinson

Jason Baldridge

Vinodkumar Prabhakaran

DiffM

218

11 Oct 2022

Markup-to-Image Diffusion Models with Scheduled Sampling

International Conference on Learning Representations (ICLR), 2022

186

11 Oct 2022

Can Artificial Intelligence Reconstruct Ancient Mosaics?

Studies in Conservation (SIC), 2022

Fernando Moral-Andrés

Elena Merino-Gómez

Pedro Reviriego

Fabrizio Lombardi

07 Oct 2022

On Distillation of Guided Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

Ruiqi Gao

249

697

06 Oct 2022

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

Computer Vision and Pattern Recognition (CVPR), 2022

415

06 Oct 2022

Phenaki: Variable Length Video Generation From Open Domain Textual Description

International Conference on Learning Representations (ICLR), 2022

Ruben Villegas

Mohammad Babaeizadeh

Pieter-Jan Kindermans

362

486

05 Oct 2022

Imagen Video: High Definition Video Generation with Diffusion Models

Ruiqi Gao

...

David J. Fleet

441

1,862

05 Oct 2022

Progressive Text-to-Image Generation

Zhengcong Fei

Mingyuan Fan

Li Zhu

Junshi Huang

301

05 Oct 2022

Visual Prompt Tuning for Generative Transfer Learning

Computer Vision and Pattern Recognition (CVPR), 2022

324

105

03 Oct 2022

Membership Inference Attacks Against Text-to-image Generation Models

Yixin Wu

Ning Yu

Zheng Li

Michael Backes

Yang Zhang

DiffM

197

03 Oct 2022

AudioGen: Textually Guided Audio Generation

International Conference on Learning Representations (ICLR), 2022

Devi Parikh

Yossi Adi

410

392

30 Sep 2022

Understanding Pure CLIP Guidance for Voxel Grid NeRF Models

Han-Hung Lee

Angel X. Chang

148

30 Sep 2022

DreamFusion: Text-to-3D using 2D Diffusion

International Conference on Learning Representations (ICLR), 2022

879

3,151

29 Sep 2022

Make-A-Video: Text-to-Video Generation without Text-Video Data

International Conference on Learning Representations (ICLR), 2022

...

Devi Parikh

298

1,795

29 Sep 2022

Re-Imagen: Retrieval-Augmented Text-to-Image Generator

International Conference on Learning Representations (ICLR), 2022

568

230

29 Sep 2022

Learning to Learn with Generative Models of Neural Network Checkpoints

272

26 Sep 2022

All are Worth Words: A ViT Backbone for Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

Hang Su

Jun Zhu

VLM

553

499

25 Sep 2022

Extremely Simple Activation Shaping for Out-of-Distribution Detection

International Conference on Learning Representations (ICLR), 2022

Andrija Djurisic

412

201

20 Sep 2022

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Journal of Artificial Intelligence Research (JAIR), 2022

389

19 Sep 2022

Does CLIP Know My Face?

Journal of Artificial Intelligence Research (JAIR), 2022

261

15 Sep 2022

AudioLM: a Language Modeling Approach to Audio Generation

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Olivier Pietquin

...

397

819

07 Sep 2022

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Computer Vision and Pattern Recognition (CVPR), 2022

Nataniel Ruiz

Yuanzhen Li

Varun Jampani

Yael Pritch

Michael Rubinstein

Kfir Aberman

1.0K

3,756

25 Aug 2022

Text to Image Generation: Leaving no Language Behind

Pedro Reviriego

Elena Merino-Gómez

VLM

131

19 Aug 2022

Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

European Conference on Software Architecture (ECSA), 2022

210

11 Aug 2022

Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP

Neural Information Processing Systems (NeurIPS), 2022

Thao Nguyen

569

122

10 Aug 2022

Adversarial Attacks on Image Generation With Made-Up Words

Raphael Milliere

228

04 Aug 2022

DALLE-URBAN: Capturing the urban design expertise of large text to image transformers

Sachith Seneviratne

Damith A. Senanayake

Sanka Rasnayaka

Rajith Vidanaarachchi

Jason Thompson

ViT

258

03 Aug 2022

Prompt-to-Prompt Image Editing with Cross Attention Control

International Conference on Learning Representations (ICLR), 2022

Amir Hertz

Ron Mokady

J. Tenenbaum

Kfir Aberman

Yael Pritch

Daniel Cohen-Or

DiffM

719

2,333

02 Aug 2022

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

International Conference on Learning Representations (ICLR), 2022

Daniel Cohen-Or

498

2,443

02 Aug 2022

Lighting (In)consistency of Paint by Text

Hany Farid

160

27 Jul 2022

Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models

260

26 Jul 2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

Neural Information Processing Systems (NeurIPS), 2022

Jian Liang

Zicheng Liu

223

20 Jul 2022

Perspective (In)consistency of Paint by Text

Hany Farid

DiffM

202

27 Jun 2022

Worldwide AI Ethics: a review of 200 guidelines and recommendations for AI governance

Patterns (Patterns), 2022

...

Nythamar Fernandes de Oliveira

450

181

23 Jun 2022

Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

International Conference on Learning Representations (ICLR), 2022

469

473

17 Jun 2022

Write and Paint: Generative Vision-Language Models are Unified Modal Learners

International Conference on Learning Representations (ICLR), 2022

294

15 Jun 2022

Blended Latent Diffusion

ACM Transactions on Graphics (TOG), 2022

373

490

06 Jun 2022

Parallel Synthesis for Autoregressive Speech Generation

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

270

25 Apr 2022