DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation

4 December 2024

Papers citing "DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation"

Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning

278

02 Jul 2025

PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation

...

415

09 Mar 2025

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

266

22 Aug 2024

AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation

Yanchen Liu

Kai Chen

365

27 Jun 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

Sijie Zhao

Ying Shan

518

295

22 Apr 2024

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

333

177

11 Apr 2024

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

370

135

08 Feb 2024

InstanceDiffusion: Instance-level Control for Image Generation

Computer Vision and Pattern Recognition (CVPR), 2024

486

202

05 Feb 2024

SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models

Xiaodong Cun

...

Ying Shan

259

166

11 Dec 2023

InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following

309

11 Dec 2023

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Bin Lin

1.8K

1,402

16 Nov 2023

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Raghuraman Krishnamoorthi

1.8K

685

14 Oct 2023

Improved Baselines with Visual Instruction Tuning

Computer Vision and Pattern Recognition (CVPR), 2023

750

4,820

05 Oct 2023

Making LLaMA SEE and Draw with SEED Tokenizer

International Conference on Learning Representations (ICLR), 2023

Sijie Zhao

Ying Shan

263

202

02 Oct 2023

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

...

Luke Zettlemoyer

346

170

05 Sep 2023

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

449

1,487

13 Aug 2023

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

IEEE International Conference on Computer Vision (ICCV), 2023

858

313

20 Jul 2023

Planting a SEED of Vision in Large Language Model

Ying Shan

378

136

16 Jul 2023

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

International Conference on Learning Representations (ICLR), 2023

2.2K

4,427

04 Jul 2023

Controllable Text-to-Image Generation with GPT-4

Tianjun Zhang

440

29 May 2023

Generating Images with Multimodal Language Models

Neural Information Processing Systems (NeurIPS), 2023

Jing Yu Koh

Daniel Fried

Ruslan Salakhutdinov

MLLM

477

365

26 May 2023

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Jianmin Bao

Lu Yuan

492

435

25 May 2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

Neural Information Processing Systems (NeurIPS), 2023

...

Silvio Savarese

Ran Xu

531

217

18 May 2023

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

International Conference on Learning Representations (ICLR), 2023

606

3,021

20 Apr 2023

Visual Instruction Tuning

Neural Information Processing Systems (NeurIPS), 2023

1.4K

8,828

17 Apr 2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

IEEE International Conference on Computer Vision (ICCV), 2023

Ying Shan

316

763

17 Apr 2023

HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation

IEEE International Conference on Computer Vision (ICCV), 2023

Ailing Zeng

Lei Zhang

289

133

09 Apr 2023

Training-Free Layout Control with Cross-Attention Guidance

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Minghao Chen

Iro Laina

Andrea Vedaldi

DiffM

553

350

06 Apr 2023

LLaMA: Open and Efficient Foundation Language Models

...

20.2K

19,316

27 Feb 2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

AAAI Conference on Artificial Intelligence (AAAI), 2023

Ying Shan

655

1,603

16 Feb 2023

Adding Conditional Control to Text-to-Image Diffusion Models

IEEE International Conference on Computer Vision (ICCV), 2023

1.2K

6,666

10 Feb 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

International Conference on Machine Learning (ICML), 2023

Silvio Savarese

1.6K

7,623

30 Jan 2023

GLIGEN: Open-Set Grounded Text-to-Image Generation

Computer Vision and Pattern Recognition (CVPR), 2023

Jianwei Yang

624

883

17 Jan 2023

ReCo: Region-Controlled Text-to-Image Generation

Computer Vision and Pattern Recognition (CVPR), 2022

...

Zicheng Liu

347

210

23 Nov 2022

InstructPix2Pix: Learning to Follow Image Editing Instructions

Computer Vision and Pattern Recognition (CVPR), 2022

Tim Brooks

Aleksander Holynski

Alexei A. Efros

DiffM

1.6K

2,834

17 Nov 2022

Imagic: Text-Based Real Image Editing with Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

891

1,435

17 Oct 2022

LAION-5B: An open large-scale dataset for training next generation image-text models

Neural Information Processing Systems (NeurIPS), 2022

...

1.5K

4,964

16 Oct 2022

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Computer Vision and Pattern Recognition (CVPR), 2022

Nataniel Ruiz

Yuanzhen Li

Varun Jampani

Yael Pritch

Michael Rubinstein

Kfir Aberman

1.5K

4,101

25 Aug 2022

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

International Conference on Learning Representations (ICLR), 2022

Daniel Cohen-Or

779

2,652

02 Aug 2022

Classifier-Free Diffusion Guidance

Jonathan Ho

Tim Salimans

FaML

710

5,964

26 Jul 2022

DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022

Reza Yazdani Aminabadi

...

Yuxiong He

525

561

30 Jun 2022

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

...

Raphael Gontijo-Lopes

David J Fleet

1.5K

8,076

23 May 2022

Hierarchical Text-Conditional Image Generation with CLIP Latents

1.5K

8,816

13 Apr 2022

High-Resolution Image Synthesis with Latent Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2021

4.8K

23,580

20 Dec 2021

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

International Conference on Machine Learning (ICML), 2021

1.4K

4,672

20 Dec 2021

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

995

1,808

03 Nov 2021

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Jiajun Wu

Jun-Yan Zhu

Stefano Ermon

DiffM

856

2,070

02 Aug 2021

Variational Diffusion Models

1.1K

1,448

01 Jul 2021

LoRA: Low-Rank Adaptation of Large Language Models

International Conference on Learning Representations (ICLR), 2021

OffRL AI4TS AI4CE ALM AIMat

1.9K

17,979

17 Jun 2021

Diffusion Models Beat GANs on Image Synthesis

Neural Information Processing Systems (NeurIPS), 2021

Prafulla Dhariwal

Alex Nichol

3.9K

11,425

11 May 2021