Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 1,247 papers shown

I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow

Hongsheng Li

Zhanyu Ma

Peng Gao

295

10 Oct 2024

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

521

10 Oct 2024

MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion

International Conference on Learning Representations (ICLR), 2024

354

10 Oct 2024

IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

International Conference on Learning Representations (ICLR), 2024

Xinchen Zhang

Ling Yang

Mengdi Wang

Bin Cui

EGVM CoGe

332

09 Oct 2024

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

International Conference on Learning Representations (ICLR), 2024

712

292

09 Oct 2024

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

613

269

09 Oct 2024

Pyramidal Flow Matching for Efficient Video Generative Modeling

International Conference on Learning Representations (ICLR), 2024

Kun Xu

...

Yang Song

517

202

08 Oct 2024

Active Fine-Tuning of Multi-Task Policies

543

07 Oct 2024

Image Watermarks are Removable Using Controllable Regeneration from Clean Noise

International Conference on Learning Representations (ICLR), 2024

Yepeng Liu

Yiren Song

Hai Ci

Yu Zhang

Haofan Wang

Mike Zheng Shou

Yuheng Bu

WIGM

330

07 Oct 2024

A Reflection on the Impact of Misspecifying Unidentifiable Causal Inference Models in Surrogate Endpoint Evaluation

Gokce Deliorman

Florian Stijven

Wim Van der Elst

Maria del Carmen Pardo

Ariel Alonso

CML

236

06 Oct 2024

Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models

470

06 Oct 2024

Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting

495

04 Oct 2024

Stochastic Sampling from Deterministic Flow Models

Saurabh Singh

Ian S. Fischer

251

03 Oct 2024

Channel-aware Contrastive Conditional Diffusion for Multivariate Probabilistic Time Series Forecasting

Siyang Li

Yize Chen

Hui Xiong

DiffM AI4TS

246

03 Oct 2024

Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

International Conference on Learning Representations (ICLR), 2024

402

03 Oct 2024

Local Flow Matching Generative Models

Chen Xu

Xiuyuan Cheng

Yao Xie

371

03 Oct 2024

Selective Attention Improves Transformer

International Conference on Learning Representations (ICLR), 2024

Yaniv Leviathan

Matan Kalman

Yossi Matias

349

03 Oct 2024

ControlAR: Controllable Image Generation with Autoregressive Models

International Conference on Learning Representations (ICLR), 2024

Xiaoxin Chen

Wenyu Liu

Xinggang Wang

DiffM

675

03 Oct 2024

Denoising with a Joint-Embedding Predictive Architecture

International Conference on Learning Representations (ICLR), 2024

Dengsheng Chen

Jie Hu

Xiaoming Wei

Enhua Wu

DiffM

481

02 Oct 2024

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

International Conference on Learning Representations (ICLR), 2024

Yu Wang

Zhenguo Li

Xihui Liu

380

02 Oct 2024

KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models

Wei-Lun Chao

356

02 Oct 2024

Multimodal Pragmatic Jailbreak on Text-to-image Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

313

27 Sep 2024

FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner

Neural Information Processing Systems (NeurIPS), 2024

186

26 Sep 2024

JoyType: A Robust Design for Multilingual Visual Text Creation

Chao Li

Chen Jiang

Xiaolong Liu

Jun Zhao

Guoxin Wang

DiffM

350

26 Sep 2024

FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

276

26 Sep 2024

Multi-modal Generative AI: Multi-modal LLMs, Diffusions, and the Unification

491

23 Sep 2024

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

International Conference on Learning Representations (ICLR), 2024

Weifeng Lin

Xinyu Wei

Renrui Zhang

Le Zhuo

Shitian Zhao

...

Junlin Xie

Yu Qiao

Peng Gao

Hongsheng Li

MLLM DiffM

566

23 Sep 2024

Imagine yourself: Tuning-Free Personalized Image Generation

Felix Juefei-Xu

...

Ning Zhang

218

20 Sep 2024

AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

378

19 Sep 2024

Understanding Implosion in Text-to-Image Generative Models

Conference on Computer and Communications Security (CCS), 2024

Wenxin Ding

Cathy Y. Li

Shawn Shan

Ben Y. Zhao

Haitao Zheng

343

18 Sep 2024

Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation

Dimitrios Christodoulou

Mads Kuhlmann-Jørgensen

EGVM

181

18 Sep 2024

Automatic Scene Generation: State-of-the-Art Techniques, Models, Datasets, Challenges, and Future Prospects

IEEE Access, 2024

273

14 Sep 2024

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Ye Bai

Haonan Chen

Jitong Chen

Zhuo Chen

...

Shicen Zhou

314

13 Sep 2024

Token Turing Machines are Efficient Vision Models

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Purvish Jajal

Nick Eliopoulos

Benjamin Shiue-Hal Chou

George K. Thiruvathukal

James C. Davis

Yung-Hsiang Lu

374

11 Sep 2024

Alignment of Diffusion Models: Fundamentals, Challenges, and Future

462

11 Sep 2024

Differentially Private Kernel Density Estimation

Erzhi Liu

Jerry Yao-Chieh Hu

Alex Reneau

Zhao Song

Han Liu

458

03 Sep 2024

Affordance-based Robot Manipulation with Flow Matching

Fan Zhang

Michael Gienger

692

02 Sep 2024

Law of Vision Representation in MLLMs

577

29 Aug 2024

Hand1000: Generating Realistic Hands from Text with Only 1,000 Images

AAAI Conference on Artificial Intelligence (AAAI), 2024

Haozhuo Zhang

368

28 Aug 2024

MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

Xu He

Xiaoyu Li

Di Kang

Liyang Chen

Han Zhang

Haolin Zhuang

353

26 Aug 2024

Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations

419

24 Aug 2024

Taming Text-to-Image Synthesis for Novices: User-centric Prompt Generation via Multi-turn Guidance

...

301

23 Aug 2024

MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

AAAI Conference on Artificial Intelligence (AAAI), 2024

Yu Qiao

Yali Wang

VGen

315

20 Aug 2024

An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation

Meishan Zhang

338

16 Aug 2024

Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models

Fangming Chen

339

13 Aug 2024

Music2Latent: Consistency Autoencoders for Latent Audio Compression

International Society for Music Information Retrieval Conference (ISMIR), 2024

Marco Pasini

Stefan Lattner

George Fazekas

229

12 Aug 2024

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

International Conference on Learning Representations (ICLR), 2024

Zhuoyi Yang

Wendi Zheng

...

Xiaotao Gu

Yuxiao Dong

Jie Tang

DiffM VGen

860

1,293

12 Aug 2024

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

283

06 Aug 2024

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Shitian Zhao

Xinyue Li

Qi Qin

Yu Qiao

Hongsheng Li

Peng Gao

MLLM

414

111

05 Aug 2024

Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation

275

01 Aug 2024