ResearchTrend.AI
LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 505 papers shown
Title
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
29
77
0
12 Aug 2023
Few-shot medical image classification with simple shape and texture text descriptors using vision-language models
Michal Byra
M. F. Rachmadi
Henrik Skibbe
VLM
28
6
0
08 Aug 2023
Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks
Kousik Rajesh
Mrigank Raman
M. A. Karim
Pranit Chawla
VLM
23
2
0
31 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
18
117
0
25 Jul 2023
Towards a Visual-Language Foundation Model for Computational Pathology
Ming Y. Lu
Bowen Chen
Drew F. K. Williamson
Richard J. Chen
Ivy Liang
...
Andrew Zhang
L. Le
Georg Gerber
Anil V. Parwani
Faisal Mahmood
VLM
MedIm
33
46
0
24 Jul 2023
Latent Code Augmentation Based on Stable Diffusion for Data-free Substitute Attacks
Mingwen Shao
Lingzhuang Meng
Yuanjian Qiao
Lixu Zhang
W. Zuo
DiffM
19
0
0
24 Jul 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
38
8
0
18 Jul 2023
Image Captions are Natural Prompts for Text-to-Image Models
Shiye Lei
Hao Chen
Senyang Zhang
Bo-Lu Zhao
Dacheng Tao
VLM
24
19
0
17 Jul 2023
Zero-Shot Image Harmonization with Generative Model Prior
Jianqi Chen
Yilan Zhang
Zhengxia Zou
Keyan Chen
Z. Shi
DiffM
24
5
0
17 Jul 2023
An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
Hiroki Naganuma
Ryuichiro Hataya
Kotaro Yoshida
Ioannis Mitliagkas
OODD
84
1
0
17 Jul 2023
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
Kate Sanders
David Etter
Reno Kriz
Benjamin Van Durme
VGen
26
7
0
06 Jul 2023
Advancing Zero-Shot Digital Human Quality Assessment through Text-Prompted Evaluation
Zicheng Zhang
Wei Sun
Yingjie Zhou
Haoning Wu
Chunyi Li
Xiongkuo Min
Xiaohong Liu
Guangtao Zhai
Weisi Lin
17
34
0
06 Jul 2023
Collaborative Score Distillation for Consistent Visual Synthesis
Subin Kim
Kyungmin Lee
June Suk Choi
Jongheon Jeong
Kihyuk Sohn
Jinwoo Shin
DiffM
19
21
0
04 Jul 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
C. L. P. Chen
Yaochu Jin
Lingjuan Lyu
AIFin
AI4CE
86
85
0
27 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
S. Hall
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
11
38
0
21 Jun 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
128
109
0
20 Jun 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
34
251
0
15 Jun 2023
Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin
Xuli Shen
Bin Li
Xiangyang Xue
18
36
0
14 Jun 2023
GeneCIS: A Benchmark for General Conditional Image Similarity
S. Vaze
Nicolas Carion
Ishan Misra
VLM
DiffM
27
26
0
13 Jun 2023
Sticker820K: Empowering Interactive Retrieval with Stickers
Sijie Zhao
Yixiao Ge
Zhongang Qi
Lin Song
Xiaohan Ding
Zehua Xie
Ying Shan
20
6
0
12 Jun 2023
Transferring Foundation Models for Generalizable Robotic Manipulation
Jiange Yang
Wenhui Tan
Chuhao Jin
Keling Yao
Bei Liu
Jianlong Fu
Ruihua Song
Gangshan Wu
Limin Wang
LM&Ro
47
6
0
09 Jun 2023
Improving neural network representations using human similarity judgments
Lukas Muttenthaler
Lorenz Linhardt
Jonas Dippel
Robert A. Vandermeulen
Katherine L. Hermann
Andrew Kyle Lampinen
Simon Kornblith
37
29
0
07 Jun 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
18
1
0
05 Jun 2023
Understanding and Mitigating Copying in Diffusion Models
Gowthami Somepalli
Vasu Singla
Micah Goldblum
Jonas Geiping
Tom Goldstein
DiffM
16
125
0
31 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
16
41
0
29 May 2023
Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors
Paul S. Scotti
Atmadeep Banerjee
J. Goode
Stepan Shabalin
A. Nguyen
...
Nathalie Verlinde
Elad Yundler
David Weisberg
K. A. Norman
Tanishq Mathew Abraham
DiffM
32
106
0
29 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu-Yun Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen
DiffM
33
88
0
29 May 2023
Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?
Manuel Brack
Felix Friedrich
P. Schramowski
Kristian Kersting
EGVM
18
13
0
28 May 2023
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
Marco Bellagente
Manuel Brack
H. Teufel
Felix Friedrich
Bjorn Deiseroth
...
Koen Oostermeijer
Andres Felipe Cruz Salinas
P. Schramowski
Kristian Kersting
Samuel Weinbach
36
15
0
24 May 2023
In-Context Impersonation Reveals Large Language Models' Strengths and Biases
Leonard Salewski
Stephan Alaniz
Isabel Rio-Torto
Eric Schulz
Zeynep Akata
22
148
0
24 May 2023
Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence
Grace Luo
Lisa Dunlap
Dong Huk Park
Aleksander Holynski
Trevor Darrell
26
118
0
23 May 2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Ziyi Yang
Mahmoud Khademi
Yichong Xu
Reid Pryzant
Yuwei Fang
...
Yu Shi
Lu Yuan
Takuya Yoshioka
Michael Zeng
Xuedong Huang
17
2
0
21 May 2023
Data Redaction from Conditional Generative Models
Zhifeng Kong
Kamalika Chaudhuri
KELM
16
7
0
18 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
16
114
0
18 May 2023
OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
Minghua Liu
Ruoxi Shi
Kaiming Kuang
Yinhao Zhu
Xuanlin Li
Shizhong Han
H. Cai
Fatih Porikli
Hao Su
3DPC
27
116
0
18 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
25
172
0
17 May 2023
Common Diffusion Noise Schedules and Sample Steps are Flawed
Shanchuan Lin
Bingchen Liu
Jiashi Li
Xiao Yang
DiffM
14
200
0
15 May 2023
Self-Chained Image-Language Model for Video Localization and Question Answering
Shoubin Yu
Jaemin Cho
Prateek Yadav
Mohit Bansal
36
129
0
11 May 2023
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai
Yinpeng Dong
Qingni Shen
Shih-Chieh Pu
Yuejian Fang
Hang Su
30
70
0
07 May 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
49
3,011
0
14 Apr 2023
Memory Efficient Diffusion Probabilistic Models via Patch-based Generation
Shinei Arakawa
Hideki Tsunashima
Daichi Horita
Keitaro Tanaka
Shigeo Morishima
DiffM
8
3
0
14 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
22
118
0
13 Apr 2023
Expressive Text-to-Image Generation with Rich Text
Songwei Ge
Taesung Park
Jun-Yan Zhu
Jia-Bin Huang
DiffM
77
79
0
13 Apr 2023
Control3Diff: Learning Controllable 3D Diffusion Models from Single-view Images
Jiatao Gu
Qingzhe Gao
Shuangfei Zhai
Baoquan Chen
Lingjie Liu
J. Susskind
28
29
0
13 Apr 2023
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr
Pengzhan Sun
Xiaoqian Shen
Faizan Farooq Khan
Li Erran Li
Mohamed Elhoseiny
VLM
11
76
0
11 Apr 2023
Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models
Nikita Starodubcev
Dmitry Baranchuk
Valentin Khrulkov
Artem Babenko
DiffM
45
4
0
10 Apr 2023
Exploring Vision-Language Models for Imbalanced Learning
Yidong Wang
Zhuohao Yu
Jindong Wang
Qiang Heng
Haoxing Chen
Wei Ye
Rui Xie
Xingxu Xie
Shi-Bo Zhang
VLM
26
30
0
04 Apr 2023
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
29
29
0
02 Apr 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zhangyang Wang
Humphrey Shi
DiffM
42
172
0
30 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
33
223
0
28 Mar 2023