ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.05871
  4. Cited By
OmniControlNet: Dual-stage Integration for Conditional Image Generation

OmniControlNet: Dual-stage Integration for Conditional Image Generation

9 June 2024
Yilin Wang
Haiyang Xu
Xiang Zhang
Zeyuan Chen
Zhizhou Sha
Zirui Wang
Zhuowen Tu
    VLM
ArXivPDFHTML

Papers citing "OmniControlNet: Dual-stage Integration for Conditional Image Generation"

23 / 23 papers shown
Title
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
Jiale Zhao
VGen
71
0
0
08 May 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
65
2
0
05 Apr 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
87
0
0
17 Mar 2025
Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows
Chengyue Gong
Xiaoyu Li
Yingyu Liang
Jiangxuan Long
Zhenmei Shi
Zhao-quan Song
Yu Tian
54
3
0
12 Mar 2025
HOFAR: High-Order Augmentation of Flow Autoregressive Transformers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Mingda Wan
75
1
0
11 Mar 2025
On Computational Limits of FlowAR Models: Expressivity and Efficiency
On Computational Limits of FlowAR Models: Expressivity and Efficiency
Chengyue Gong
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
61
3
0
23 Feb 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
89
18
0
21 Feb 2025
DPBloomfilter: Securing Bloom Filters with Differential Privacy
DPBloomfilter: Securing Bloom Filters with Differential Privacy
Yekun Ke
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
67
1
0
02 Feb 2025
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies
Yuefan Cao
Xiaoyu Li
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
61
7
0
02 Feb 2025
Circuit Complexity Bounds for Visual Autoregressive Model
Circuit Complexity Bounds for Visual Autoregressive Model
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
45
5
0
08 Jan 2025
HSR-Enhanced Sparse Attention Acceleration
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
84
18
0
14 Oct 2024
Differentially Private Attention Computation
Differentially Private Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
42
19
0
08 May 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
247
4,186
0
30 Jan 2023
Pretraining is All You Need for Image-to-Image Translation
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
184
177
0
25 May 2022
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
325
1,584
0
10 Nov 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,320
0
12 Dec 2018
Deep Ordinal Regression Network for Monocular Depth Estimation
Deep Ordinal Regression Network for Monocular Depth Estimation
Huan Fu
Mingming Gong
Chaohui Wang
Kayhan Batmanghelich
Dacheng Tao
MDE
180
1,687
0
06 Jun 2018
Feature Pyramid Networks for Object Detection
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
166
21,643
0
09 Dec 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
249
1,821
0
18 Aug 2016
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
281
35,677
0
08 Jun 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
232
74,467
0
18 May 2015
Indoor Semantic Segmentation using depth information
Indoor Semantic Segmentation using depth information
Camille Couprie
C. Farabet
Laurent Najman
Yann LeCun
SSeg
MDE
70
473
0
16 Jan 2013
1