ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.17002
  4. Cited By
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following

28 November 2023
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
    DiffM
ArXivPDFHTML

Papers citing "Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following"

38 / 38 papers shown
Title
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
26
0
0
07 May 2025
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion
Y. Li
Pencheng Wan
Liang Han
Yaowei Wang
Liqiang Nie
Min Zhang
36
0
0
07 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
74
1
0
05 May 2025
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
Chenhan Jiang
Yihan Zeng
Hang Xu
Dit-Yan Yeung
44
0
0
28 Apr 2025
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Xinli Yue
Jianhui Sun
Junda Lu
Liangchao Yao
Fan Xia
Tianyi Wang
Fengyun Rao
Jing Lyu
Yuetang Deng
21
0
0
16 Apr 2025
POEM: Precise Object-level Editing via MLLM control
POEM: Precise Object-level Editing via MLLM control
Marco Schouten
Mehmet Onurcan Kaya
Serge Belongie
Dim P. Papadopoulos
DiffM
68
0
0
10 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Y. Xu
DiffM
23
0
0
02 Apr 2025
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
H. Seo
Junseo Bang
Haechang Lee
Joohoon Lee
Byung Hyun Lee
Se Young Chun
46
0
0
29 Mar 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
45
0
0
27 Mar 2025
Towards Transformer-Based Aligned Generation with Self-Coherence Guidance
Towards Transformer-Based Aligned Generation with Self-Coherence Guidance
Shulei Wang
Wang Lin
Hai Huang
Hanting Wang
Sihang Cai
...
Tao Jin
Jingyuan Chen
Jiacheng Sun
Jieming Zhu
Zhou Zhao
DiffM
55
2
0
22 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
87
0
0
17 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
56
1
0
15 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
47
0
0
13 Mar 2025
Token Merging for Training-Free Semantic Binding in Text-to-Image
  Synthesis
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Jian Yang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
41
4
0
11 Nov 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft
  Refinement
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Z. Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Jian Yang
Ying Tai
DiffM
47
8
0
10 Nov 2024
How to Continually Adapt Text-to-Image Diffusion Models for Flexible
  Customization?
How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
Jiahua Dong
Wenqi Liang
Hongliu Li
Duzhen Zhang
Meng Cao
Henghui Ding
Salman Khan
F. Khan
DiffM
39
9
0
23 Oct 2024
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image
  Generation
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
Dewei Zhou
Ji Xie
Zongxin Yang
Yi Yang
DiffM
62
6
0
16 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
48
5
0
15 Oct 2024
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard
Amin Karimi Monsefi
Mengxi Zhou
Wei-Lun Chao
Alper Yilmaz
R. Ramnath
DiffM
36
2
0
02 Oct 2024
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We
  Learn How Vision-Language Models Function
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Chenyi Zhuang
Ying Hu
Pan Gao
DiffM
VLM
33
12
0
30 Sep 2024
Conditional Image Synthesis with Diffusion Models: A Survey
Conditional Image Synthesis with Diffusion Models: A Survey
Zheyuan Zhan
Defang Chen
Jian-Ping Mei
Zhenghe Zhao
Jiawei Chen
Chun Chen
Siwei Lyu
Can Wang
VLM
35
4
0
28 Sep 2024
Emu3: Next-Token Prediction is All You Need
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
31
147
0
27 Sep 2024
Draw Like an Artist: Complex Scene Generation with Diffusion Model via
  Composition, Painting, and Retouching
Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching
Minghao Liu
Le Zhang
Yingjie Tian
Xiaochao Qu
Luoqi Liu
Ting Liu
DiffM
CoGe
18
2
0
25 Aug 2024
Story3D-Agent: Exploring 3D Storytelling Visualization with Large
  Language Models
Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models
Yuzhou Huang
Yiran Qin
Shunlin Lu
Xintao Wang
Rui Huang
Ying Shan
Ruimao Zhang
VGen
29
1
0
21 Aug 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
Qian Zhang
Xiangzi Dai
Ninghua Yang
Xiang An
Ziyong Feng
Xingyu Ren
VLM
CLIP
35
17
0
02 Aug 2024
Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
Ekaterina Iakovleva
Fabio Pizzati
Philip H. S. Torr
Stéphane Lathuiliere
DiffM
19
0
0
29 Jul 2024
ReCorD: Reasoning and Correcting Diffusion for HOI Generation
ReCorD: Reasoning and Correcting Diffusion for HOI Generation
Jian-Yu Jiang-Lin
Kang-Yang Huang
Ling Lo
Yi-Ning Huang
Terence Lin
Jhih-Ciang Wu
Hong-Han Shuai
Wen-Huang Cheng
DiffM
24
5
0
25 Jul 2024
ResMaster: Mastering High-Resolution Image Generation via Structural and
  Fine-Grained Guidance
ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance
Shuwei Shi
Wenbo Li
Yuechen Zhang
Jingwen He
Biao Gong
Yinqiang Zheng
43
10
0
24 Jun 2024
DA-HFNet: Progressive Fine-Grained Forgery Image Detection and
  Localization Based on Dual Attention
DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention
Yang Liu
Xiaofei Li
Jun Zhang
Shengze Hu
Jun Lei
64
4
0
03 Jun 2024
Compositional Text-to-Image Generation with Dense Blob Representations
Compositional Text-to-Image Generation with Dense Blob Representations
Weili Nie
Sifei Liu
Morteza Mardani
Chao Liu
Benjamin Eckart
Arash Vahdat
DiffM
70
16
0
14 May 2024
TheaterGen: Character Management with LLM for Consistent Multi-turn
  Image Generation
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Junhao Cheng
Baiqiao Yin
Kaixin Cai
Minbin Huang
Hanhui Li
...
Yue Li
Yifei Li
Yuhao Cheng
Yiqiang Yan
Xiaodan Liang
DiffM
MLLM
26
12
0
29 Apr 2024
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Xiwei Hu
Rui Wang
Yixiao Fang
Bin-Bin Fu
Pei Cheng
Gang Yu
VLM
57
39
0
08 Mar 2024
MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object
  Diffusion
MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object Diffusion
Sen Li
Ruochen Wang
Cho-Jui Hsieh
Minhao Cheng
Tianyi Zhou
MLLM
LM&Ro
27
3
0
20 Feb 2024
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image
  Editing
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Kai Zhang
Lingbo Mo
Wenhu Chen
Huan Sun
Yu-Chuan Su
EGVM
105
235
0
16 Jun 2023
A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets
  Prompt Engineering
A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering
Chaoning Zhang
Fachrina Dewi Puspitasari
Sheng Zheng
Chenghao Li
Yu Qiao
...
Caiyan Qin
François Rameau
Lik-Hang Lee
Sung-Ho Bae
Choong Seon Hong
VLM
76
61
0
12 May 2023
Pretraining is All You Need for Image-to-Image Translation
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
176
177
0
25 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
1