Cross-Modal Contrastive Learning for Text-to-Image Generation

12 January 2021
Han Zhang
Jing Yu Koh
Jason Baldridge
Honglak Lee
Yinfei Yang
    GAN

Papers citing "Cross-Modal Contrastive Learning for Text-to-Image Generation"

50 / 74 papers shown
Learning Graph Representation of Agent Diffusers
Youcef Djenouri
Nassim Belmecheri
Tomasz Michalak
Jan Dubiński
Ahmed Nabil Belbachir
Anis Yazidi
AI4CE
26
0
0
10 May 2025
Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos
Yongtao Wu
Razvan Pascanu
Philip Torr
V. Cevher
AAML
96
0
0
17 Apr 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
40
0
0
09 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
51
1
0
02 Mar 2025
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
60
0
0
03 Jan 2025
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
68
62
0
09 Oct 2024
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He
Haodong Li
Wei Yin
Yixun Liang
Leheng Li
Kaiqiang Zhou
Hongbo Zhang
Bingbing Liu
Ying-Cong Chen
DiffM
VLM
44
40
0
26 Sep 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Y. Li
Fan Ma
Zongxin Yang
Y. Yang
91
11
0
02 Jul 2024
Ensembling Diffusion Models via Adaptive Feature Aggregation
Cong Wang
Kuan Tian
Yonghang Guan
Jun Zhang
Zhiwei Jiang
Fei Shen
Xiao Han
34
5
0
27 May 2024
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Zizhao Hu
Mohammad Rostami
34
0
0
25 May 2024
Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition
Chengxin Chen
Pengyuan Zhang
26
5
0
26 Dec 2023
Tuning-Free Inversion-Enhanced Control for Consistent Image Editing
Xiaoyue Duan
Shuhao Cui
Guoliang Kang
Baochang Zhang
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
31
8
0
22 Dec 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
26
43
0
28 Nov 2023
Tell2Design: A Dataset for Language-Guided Floor Plan Generation
Sicong Leng
Yangqiaoyu Zhou
Mohammed Haroon Dupty
W. Lee
Sam Joyce
Wei Lu
3DV
27
10
0
27 Nov 2023
KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing
Jiarui Yao
Yifan Liu
Simon S. Du
Shifeng Chen
DiffM
16
24
0
28 Sep 2023
Exchanging-based Multimodal Fusion with Transformer
Renyu Zhu
Chengcheng Han
Yong Qian
Qiushi Sun
Xiang Li
Ming Gao
Xuezhi Cao
Yunsen Xian
23
2
0
05 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
J. Liu
65
31
0
27 Aug 2023
Towards Discriminative Representations with Contrastive Instances for Real-Time UAV Tracking
Dan Zeng
Mingliang Zou
Xucheng Wang
Shuiwang Li
23
13
0
22 Aug 2023
Improving Tuning-Free Real Image Editing with Proximal Guidance
Ligong Han
Song Wen
Qi Chen
Zhixing Zhang
Kunpeng Song
...
Qilong Zhangli
Jindong Jiang
Zhaoyang Xia
Akash Srivastava
Dimitris N. Metaxas
DiffM
22
56
0
08 Jun 2023
Differential Diffusion: Giving Each Pixel Its Strength
E. Levin
Ohad Fried
DiffM
37
20
0
01 Jun 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu-Yun Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen
DiffM
33
88
0
29 May 2023
ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis
Hongchen Tan
Baocai Yin
Kun Wei
Xiuping Liu
Xin Li
13
16
0
13 Apr 2023
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Mayu Otani
Riku Togashi
Yu Sawai
Ryosuke Ishigami
Yuta Nakashima
Esa Rahtu
J. Heikkilä
Shin'ichi Satoh
28
62
0
04 Apr 2023
Freestyle Layout-to-Image Synthesis
Han Xue
Z. Huang
Qianru Sun
Li-Na Song
Wenjun Zhang
DiffM
15
62
0
25 Mar 2023
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
Jing Zhao
Heliang Zheng
Chaoyue Wang
L. Lan
Wenjing Yang
VLM
38
17
0
23 Mar 2023
Flexible-modal Deception Detection with Audio-Visual Adapter
Zhaoxu Li
Zitong Yu
Nithish Muthuchamy Selvaraj
Xiaobao Guo
Bingquan Shen
A. Kong
Alex C. Kot
22
2
0
11 Feb 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
53
498
0
31 Jan 2023
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao
Bingkun Bao
Hao Tang
Changsheng Xu
DiffM
VLM
63
100
0
30 Jan 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
49
568
1
17 Jan 2023
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
Zhiqiu Lin
Samuel Yu
Zhiyi Kuang
Deepak Pathak
Deva Ramanan
VLM
15
100
0
16 Jan 2023
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions
Aashish Anantha Ramakrishnan
Sharon X. Huang
Dongwon Lee
16
5
0
05 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
518
0
02 Jan 2023
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Zhixing Zhang
Ligong Han
Arna Ghosh
Dimitris N. Metaxas
Jian Ren
DiffM
51
154
0
08 Dec 2022
Shifted Diffusion for Text-to-image Generation
Yufan Zhou
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
22
39
0
24 Nov 2022
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Shweta Mahajan
Leonid Sigal
DiffM
19
68
0
23 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
26
140
0
23 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
16
62
0
20 Nov 2022
Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval
Donghuo Zeng
Yanan Wang
Jianming Wu
K. Ikeda
19
4
0
07 Nov 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Rui Li
Weihua Li
Yi Yang
Hanyu Wei
Jianhua Jiang
Quan-wei Bai
DiffM
19
11
0
18 Oct 2022
DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha
Zheng Li
Ning Yu
Yang Zhang
DiffM
17
114
0
13 Oct 2022
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
66
32
0
11 Oct 2022
ManiCLIP: Multi-Attribute Face Manipulation from Text
Hao Wang
Guosheng Lin
A. Molino
Anran Wang
Jiashi Feng
Zehuan Yuan
CVBM
33
9
0
02 Oct 2022
Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation
Xintian Wu
Hanbin Zhao
Liangli Zheng
Shouhong Ding
Xi Li
29
13
0
28 Sep 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
18
90
0
29 Aug 2022
Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork
Xin Yuan
Zhe-nan Lin
Jason Kuen
Jianming Zhang
John Collomosse
27
5
0
17 Aug 2022
Word-Level Fine-Grained Story Visualization
Bowen Li
Thomas Lukasiewicz
DiffM
3DH
31
24
0
03 Aug 2022
Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels
Yinpeng Chen
Zhiyu Pan
Min Shi
Hao Lu
Zhiguo Cao
Weicai Zhong
GAN
19
3
0
31 Jul 2022
ShapeCrafter: A Recursive Text-Conditioned 3D Shape Generation Model
Rao Fu
Xiaoyu Zhan
Yiwen Chen
Daniel E. Ritchie
Srinath Sridhar
29
79
0
19 Jul 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
90
1,061
0
22 Jun 2022
Intra-agent speech permits zero-shot task acquisition
Chen Yan
Federico Carnevale
Petko Georgiev
Adam Santoro
Aurelia Guy
Alistair Muldal
Chia-Chun Hung
Josh Abramson
Timothy Lillicrap
Greg Wayne
LM&Ro
36
9
0
07 Jun 2022