LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP
arXiv:2210.08402

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 505 papers shown
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
T. Le
Hao Phung
Thuan Hoang Nguyen
Quan Dao
Ngoc N. Tran
Anh Tran
19
91
0
27 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
52
463
0
27 Mar 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
19
931
0
27 Mar 2023
Freestyle Layout-to-Image Synthesis
Han Xue
Z. Huang
Qianru Sun
Li-Na Song
Wenjun Zhang
DiffM
15
62
0
25 Mar 2023
ReVersion: Diffusion-Based Relation Inversion from Images
Ziqi Huang
Tianxing Wu
Yuming Jiang
Kelvin C. K. Chan
Ziwei Liu
25
65
0
23 Mar 2023
MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
Jing Zhao
Heliang Zheng
Chaoyue Wang
L. Lan
Wenjing Yang
VLM
38
17
0
23 Mar 2023
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Lukas Höllein
Ang Cao
Andrew Owens
Justin Johnson
Matthias Nießner
DiffM
30
177
0
21 Mar 2023
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
Weijia Wu
Yuzhong Zhao
Mike Zheng Shou
Hong Zhou
Chunhua Shen
31
140
0
21 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
30
19
0
17 Mar 2023
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
D. Kothandaraman
Tianyi Zhou
Ming Lin
Dinesh Manocha
24
5
0
15 Mar 2023
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo
Wooseok Jang
Minseop Kwak
Ines Hyeonsu Kim
Jaehoon Ko
Junho Kim
Jin-Hwa Kim
Jiyoung Lee
Seung Wook Kim
DiffM
30
135
0
14 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
158
214
0
03 Mar 2023
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
Cusuh Ham
James Hays
Jingwan Lu
Krishna Kumar Singh
Zhifei Zhang
Tobias Hinz
DiffM
17
24
0
24 Feb 2023
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini
Matthew Jagielski
Christopher A. Choquette-Choo
Daniel Paleka
Will Pearce
Hyrum S. Anderson
Andreas Terzis
Kurt Thomas
Florian Tramèr
SILM
24
181
0
20 Feb 2023
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal
Lior Yariv
Y. Lipman
Tali Dekel
45
364
1
16 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?
Kathleen C. Fraser
S. Kiritchenko
I. Nejadgholi
DiffM
24
36
0
14 Feb 2023
Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
Hyeonho Jeong
Gihyun Kwon
Jong Chul Ye
24
20
0
08 Feb 2023
Mixture of Diffusers for scene composition and high resolution image generation
Á. Jiménez
DiffM
13
45
0
05 Feb 2023
Eliminating Contextual Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion
Zuopeng Yang
Tianshu Chu
Xin Lin
Erdun Gao
Daqing Liu
J. Yang
Chaoyue Wang
DiffM
21
16
0
05 Feb 2023
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Zekun Qi
Runpei Dong
Guo Fan
Zheng Ge
Xiangyu Zhang
Kaisheng Ma
Li Yi
28
117
0
05 Feb 2023
TEXTure: Text-Guided Texturing of 3D Shapes
Elad Richardson
G. Metzer
Yuval Alaluf
Raja Giryes
Daniel Cohen-Or
DiffM
31
258
0
03 Feb 2023
Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang
Varun Jampani
Yuanzhen Li
Antonio Torralba
Stefanie Jegelka
VLM
23
96
0
31 Jan 2023
Discovering and Mitigating Visual Biases through Keyword Explanation
Younghyun Kim
Sangwoo Mo
Minkyu Kim
Kyungmin Lee
Jaeho Lee
Jinwoo Shin
26
30
0
26 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
32
11
0
17 Jan 2023
Diffusing Surrogate Dreams of Video Scenes to Predict Video Memorability
Lorin Sweeney
Graham Healy
A. Smeaton
DiffM
20
2
0
19 Dec 2022
Transferring General Multimodal Pretrained Models to Text Recognition
Junyang Lin
Xuancheng Ren
Yichang Zhang
Gao Liu
Peng Wang
An Yang
Chang Zhou
32
4
0
19 Dec 2022
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Haochen Wang
Xiaodan Du
Jiahao Li
Raymond A. Yeh
Gregory Shakhnarovich
DiffM
43
527
0
01 Dec 2022
One-shot recognition of any material anywhere using contrastive learning with physics-based rendering
Manuel S. Drehwald
S. Eppel
Jolina Li
Han Hao
Alán Aspuru-Guzik
17
6
0
01 Dec 2022
DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim
S. Chun
DiffM
18
39
0
29 Nov 2022
Context-Aware Robust Fine-Tuning
Xiaofeng Mao
YueFeng Chen
Xiaojun Jia
Rong Zhang
Hui Xue
Zhao Li
VLM
CLIP
20
23
0
29 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
27
37
0
23 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
26
140
0
23 Nov 2022
Open-vocabulary Attribute Detection
M. A. Bravo
Sudhanshu Mittal
Simon Ging
Thomas Brox
VLM
ObjD
14
30
0
23 Nov 2022
Investigating Prompt Engineering in Diffusion Models
Sam Witteveen
Martin Andrews
11
57
0
21 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
16
62
0
20 Nov 2022
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
Vaclav Kosar
A. Hoskovec
Milan Šulc
Radek Bartyzal
VLM
22
3
0
17 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
54
673
0
14 Nov 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
32
269
0
09 Nov 2022
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Muyang Li
Ji Lin
Chenlin Meng
Stefano Ermon
Song Han
Jun-Yan Zhu
DiffM
30
45
0
03 Nov 2022
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models
Zijie J. Wang
Evan Montoya
David Munechika
Haoyang Yang
Benjamin Hoover
Duen Horng Chau
19
287
0
26 Oct 2022
Conditional Diffusion with Less Explicit Guidance via Model Predictive Control
Max W. Shen
Ehsan Hajiramezanali
Gabriele Scalia
Alex Tseng
N. Diamant
Tommaso Biancalani
Andreas Loukas
34
1
0
21 Oct 2022
1st Place Solution in Google Universal Images Embedding
Shihao Shao
Qinghua Cui
3DGS
17
7
0
16 Oct 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
47
2,304
0
29 Sep 2022
LANTERN-RD: Enabling Deep Learning for Mitigation of the Invasive Spotted Lanternfly
Srivatsa Kundurthy
11
1
0
12 May 2022
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
S. Gadre
Mitchell Wortsman
Gabriel Ilharco
Ludwig Schmidt
Shuran Song
CLIP
LM&Ro
25
142
0
20 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,110
0
28 Jan 2022
General Facial Representation Learning in a Visual-Linguistic Manner
Yinglin Zheng
Hao Yang
Ting Zhang
Jianmin Bao
Dongdong Chen
Yangyu Huang
Lu Yuan
Dong Chen
Ming Zeng
Fang Wen
CVBM
135
162
0
06 Dec 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
188
403
0
13 Jul 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
197
308
0
02 Mar 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,764
0
24 Feb 2021