ResearchTrend.AI
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

17 March 2023
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
    DiffM

Papers citing "GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation"

10 / 10 papers shown
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
132
14
0
03 Oct 2024
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
Jiannan Huang
Jun Hao Liew
Hanshu Yan
Yuyang Yin
Yao Zhao
Yunchao Wei
DiffM
87
5
0
27 May 2024
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
Marco Bellagente
Manuel Brack
H. Teufel
Felix Friedrich
Bjorn Deiseroth
...
Koen Oostermeijer
Andres Felipe Cruz Salinas
P. Schramowski
Kristian Kersting
Samuel Weinbach
33
15
0
24 May 2023
Diffusion Models in Vision: A Survey
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM
VLM
MedIm
186
1,098
0
10 Sep 2022
On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence
Y. Ma
Doris Y. Tsao
H. Shum
54
75
0
11 Jul 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
67
71
0
25 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Domain-Adversarial Training of Neural Networks
Yaroslav Ganin
E. Ustinova
Hana Ajakan
Pascal Germain
Hugo Larochelle
François Laviolette
M. Marchand
Victor Lempitsky
GAN
OOD
149
9,300
0
28 May 2015