Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022


Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 1,010 papers shown

Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Computer Vision and Pattern Recognition (CVPR), 2024

486

25 Nov 2024

SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE

Computer Vision and Pattern Recognition (CVPR), 2024

652

25 Nov 2024

Visual-Word Tokenizer: Beyond Fixed Sets of Tokens in Vision Transformers

681

23 Nov 2024

TPIE: Topology-Preserved Image Editing With Text Instructions

Nivetha Jayakumar

Srivardhan Reddy Gadila

442

22 Nov 2024

Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps

Computer Vision and Pattern Recognition (CVPR), 2024

296

21 Nov 2024

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

Computer Vision and Pattern Recognition (CVPR), 2024

246

21 Nov 2024

How to Defend Against Large-scale Model Poisoning Attacks in Federated Learning: A Vertical Solution

248

16 Nov 2024

GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization

179

15 Nov 2024

Artificial Intelligence for Biomedical Video Generation

402

12 Nov 2024

ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis

Neural Information Processing Systems (NeurIPS), 2024

273

11 Nov 2024

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Zhennan Chen

358

10 Nov 2024

Hardware-Friendly Diffusion Models with Fixed-Size Reusable Structures for On-Device Image Generation

Sanchar Palit

Sathya Veera Reddy Dendi

Mallikarjuna Talluri

Raj Narayana Gadde

247

09 Nov 2024

Autoregressive Models in Vision: A Survey

...

494

08 Nov 2024

Clustering in Causal Attention Masking

Neural Information Processing Systems (NeurIPS), 2024

Nikita Karagodin

Yury Polyanskiy

Philippe Rigollet

318

07 Nov 2024

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Neural Information Processing Systems (NeurIPS), 2024

247

07 Nov 2024

DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning

Neural Information Processing Systems (NeurIPS), 2024

234

07 Nov 2024

Image Understanding Makes for A Good Tokenizer for Image Generation

Neural Information Processing Systems (NeurIPS), 2024

203

07 Nov 2024

ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

225

06 Nov 2024

SEE-DPO: Self Entropy Enhanced Direct Preference Optimization

Shivanshu Shekhar

Shreyas Singh

Tong Zhang

270

06 Nov 2024

Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

525

04 Nov 2024

Randomized Autoregressive Visual Generation

Ju He

326

01 Nov 2024

Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective

Guoqi Li

449

29 Oct 2024

Benchmarking Human and Automated Prompting in the Segment Anything Model

BigData Congress [Services Society] (BSS), 2024

225

29 Oct 2024

MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding

245

29 Oct 2024

Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis

Neural Information Processing Systems (NeurIPS), 2024

241

29 Oct 2024

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks

Yongchang Hao

Yanshuai Cao

Lili Mou

225

28 Oct 2024

Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion

1.0K

25 Oct 2024

FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation

Neural Information Processing Systems (NeurIPS), 2024

Christopher T. H. Teo

281

24 Oct 2024

Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances

International Conference on Learning Representations (ICLR), 2024

564

24 Oct 2024

Fast constrained sampling in pre-trained diffusion models

398

24 Oct 2024

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?

Neural Information Processing Systems (NeurIPS), 2024

Meng Cao

232

23 Oct 2024

DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Neural Information Processing Systems (NeurIPS), 2024

Ji Liu

...

168

22 Oct 2024

Elucidating the design space of language models for image generation

199

21 Oct 2024

Opportunities and Challenges of Generative-AI in Finance

BigData Congress [Services Society] (BSS), 2024

445

21 Oct 2024

BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities

International Conference on Learning Representations (ICLR), 2024

Shaozhe Hao

Xuantong Liu

546

18 Oct 2024

Assessing Open-world Forgetting in Generative Image Model Customization

Héctor Laria

Alex Gomez-Villa

Imad Eddine Marouf

Bogdan Raducanu

VLM DiffM

301

18 Oct 2024

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

International Conference on Learning Representations (ICLR), 2024

Yuanzhen Li

Michael Rubinstein

330

110

17 Oct 2024

MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models

International Joint Conference on Artificial Intelligence (IJCAI), 2024

439

17 Oct 2024

Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

Neural Information Processing Systems (NeurIPS), 2024

261

15 Oct 2024

Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling

Xiangyu Yue

226

14 Oct 2024

Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective

International Conference on Learning Representations (ICLR), 2024

Zhixu Li

1.1K

14 Oct 2024

Generating Intermediate Representations for Compositional Text-To-Image Generation

Ran Galun

Sagie Benaim

217

13 Oct 2024

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

International Conference on Learning Representations (ICLR), 2024

219

12 Oct 2024

DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation

Josh Susskind

377

10 Oct 2024

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

525

10 Oct 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Neural Information Processing Systems (NeurIPS), 2024

Rui Zhao

...

Xiang Wang

Zhangjie Wu

Junhao Zhang

Yingya Zhang

Mike Zheng Shou

DiffM VLM

328

09 Oct 2024

Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

International Conference on Learning Representations (ICLR), 2024

Fu-Yun Wang

Ling Yang

Zhaoyang Huang

Mengdi Wang

Hongsheng Li

249

09 Oct 2024

Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning

Computer Vision and Pattern Recognition (CVPR), 2024

Dongrui Liu

Linfeng Zhang

301

09 Oct 2024

Diversity-Rewarded CFG Distillation

International Conference on Learning Representations (ICLR), 2024

245

08 Oct 2024

Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models

Theo Putterman

Derek Lim

277

05 Oct 2024