ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.08402
  4. Cited By
LAION-5B: An open large-scale dataset for training next generation
  image-text models

LAION-5B: An open large-scale dataset for training next generation image-text models

16 October 2022
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
Clayton Mullis
Mitchell Wortsman
P. Schramowski
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
    VLM
    MLLM
    CLIP
ArXivPDFHTML

Papers citing "LAION-5B: An open large-scale dataset for training next generation image-text models"

50 / 520 papers shown
Title
Younger: The First Dataset for Artificial Intelligence-Generated Neural
  Network Architecture
Younger: The First Dataset for Artificial Intelligence-Generated Neural Network Architecture
Zhengxin Yang
Wanling Gao
Luzhou Peng
Yunyou Huang
Fei Tang
Jianfeng Zhan
31
0
0
20 Jun 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
44
9
0
20 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
75
23
0
17 Jun 2024
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
Wei Chen
Lin Li
Yongqi Yang
Bin Wen
Fan Yang
Tingting Gao
Yu Wu
Long Chen
VLM
VGen
43
6
0
15 Jun 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia
Rahmad Mahendra
Salsabil Maulana Akbar
Lester James Validad Miranda
Jennifer Santoso
...
Genta Indra Winata
Ruochen Zhang
Fajri Koto
Zheng-Xin Yong
Samuel Cahyawijaya
77
9
0
14 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
36
35
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
44
79
0
11 Jun 2024
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal
  Large Language Models
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Tianle Gu
Zeyang Zhou
Kexin Huang
Dandan Liang
Yixu Wang
...
Keqing Wang
Yujiu Yang
Yan Teng
Yu Qiao
Yingchun Wang
ELM
42
9
0
11 Jun 2024
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with
  Foundation Models
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models
Athanasios Tragakis
Marco Aversa
Chaitanya Kaul
Roderick Murray-Smith
Daniele Faccio
49
2
0
11 Jun 2024
Haptic Repurposing with GenAI
Haptic Repurposing with GenAI
Haoyu Wang
34
0
0
11 Jun 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
59
31
0
07 Jun 2024
Bayesian Power Steering: An Effective Approach for Domain Adaptation of
  Diffusion Models
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Ding Huang
Ting Li
Jian Huang
DiffM
31
1
0
06 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
48
16
0
06 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Xu Tan
Qifeng Chen
Y. Guo
VGen
100
16
0
06 Jun 2024
Balancing Performance and Efficiency in Zero-shot Robotic Navigation
Balancing Performance and Efficiency in Zero-shot Robotic Navigation
Dmytro Kuzmenko
N. Shvai
LM&Ro
20
0
0
05 Jun 2024
Inv-Adapter: ID Customization Generation via Image Inversion and
  Lightweight Adapter
Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter
Peng-Fei Xing
Ning Wang
Jianbo Ouyang
Zechao Li
DiffM
34
1
0
05 Jun 2024
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Erik Landolsi
Fredrik Kahl
DiffM
53
1
0
05 Jun 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang
Wufei Ma
Angtian Wang
Shuo Chen
Adam Kortylewski
Alan L. Yuille
29
3
0
02 Jun 2024
Don't drop your samples! Coherence-aware training benefits Conditional diffusion
Don't drop your samples! Coherence-aware training benefits Conditional diffusion
Nicolas Dufour
Victor Besnier
Vicky Kalogeiton
David Picard
DiffM
49
2
0
30 May 2024
CLIPLoss and Norm-Based Data Selection Methods for Multimodal
  Contrastive Learning
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang
Yifang Chen
Wendan Yan
Alex Fang
Wenjing Zhou
Kevin G. Jamieson
S. Du
32
7
0
29 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Wei Ping
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
40
29
0
29 May 2024
Multi-Modal Generative Embedding Model
Multi-Modal Generative Embedding Model
Feipeng Ma
Hongwei Xue
Guangting Wang
Yizhou Zhou
Fengyun Rao
Shilin Yan
Yueyi Zhang
Siying Wu
Mike Zheng Shou
Xiaoyan Sun
VLM
26
3
0
29 May 2024
Does Diffusion Beat GAN in Image Super Resolution?
Does Diffusion Beat GAN in Image Super Resolution?
Denis Kuznedelev
Valerii Startsev
Daniil Shlenskii
Sergey Kastryulin
36
4
0
27 May 2024
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Edison Marrese-Taylor
Hamed Damirchi
A. Hengel
VLM
35
1
0
27 May 2024
ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
F. Babiloni
Alexandros Lattas
Jiankang Deng
S. Zafeiriou
DiffM
33
4
0
26 May 2024
Composed Image Retrieval for Remote Sensing
Composed Image Retrieval for Remote Sensing
Bill Psomas
Ioannis Kakogeorgiou
Nikos Efthymiadis
Giorgos Tolias
Ondřej Chum
Yannis Avrithis
Konstantinos Karantzalos
41
4
0
24 May 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision
  Models
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM
LRM
25
17
0
24 May 2024
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via
  Selective Tensor Freezing
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
Kai Huang
Wei Gao
32
2
0
24 May 2024
Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient
Yongliang Wu
Shiji Zhou
Mingzhuo Yang
Lianzhe Wang
Wenbo Zhu
Heng Chang
Xiao Zhou
Xu Yang
Xu Yang
53
18
0
24 May 2024
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
Run Luo
Yunshui Li
Longze Chen
Wanwei He
Ting-En Lin
...
Zikai Song
Xiaobo Xia
Tongliang Liu
Min Yang
Binyuan Hui
VLM
DiffM
70
15
0
24 May 2024
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image
  Analysis
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
Yue Yang
Mona Gandhi
Yufei Wang
Yifan Wu
Michael S. Yao
Christopher Callison-Burch
James C. Gee
Mark Yatskar
48
3
0
23 May 2024
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Jiaqi Li
Qianshan Wei
Chuanyi Zhang
Guilin Qi
Miaozeng Du
Yongrui Chen
Sheng Bi
Fan Liu
VLM
MU
67
12
0
21 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
60
254
0
16 May 2024
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
Hanshu Yan
Xingchao Liu
Jiachun Pan
Jun Hao Liew
Qiang Liu
Jiashi Feng
32
40
0
13 May 2024
Fractals as Pre-training Datasets for Anomaly Detection and Localization
Fractals as Pre-training Datasets for Anomaly Detection and Localization
C. Ugwu
S. Casarin
O. Lanz
24
0
0
11 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional
  Text-to-image Generation
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
36
2
0
11 May 2024
A Survey on Personalized Content Synthesis with Diffusion Models
A Survey on Personalized Content Synthesis with Diffusion Models
Xu-Lu Zhang
Xiao Wei
Wengyu Zhang
Jinlin Wu
Zhaoxiang Zhang
Zhen Lei
Qing Li
Zhen Lei
Qing Li
EGVM
133
19
0
09 May 2024
MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
Honghua Chen
Chen Change Loy
Xingang Pan
39
11
0
05 May 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language
  Models
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
23
7
0
02 May 2024
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
An Yan
Zhengyuan Yang
Junda Wu
Wanrong Zhu
Jianwei Yang
...
K. Lin
Jianfeng Wang
Julian McAuley
Jianfeng Gao
Lijuan Wang
LRM
34
12
0
25 Apr 2024
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion
  Models
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
Qinghe Wang
Baolu Li
Xiaomin Li
Bing Cao
Liqian Ma
Huchuan Lu
Xu Jia
DiffM
40
6
0
24 Apr 2024
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
Bo Lin
Yingjing Xu
Xuanwen Bao
Zhou Zhao
Zuyong Zhang
Zhouyang Wang
57
2
0
23 Apr 2024
MultiBooth: Towards Generating All Your Concepts in an Image from Text
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Chenyang Zhu
Kai Li
Yue Ma
Chunming He
Li Xiu
DiffM
104
22
0
22 Apr 2024
Iteratively Prompting Multimodal LLMs to Reproduce Natural and
  AI-Generated Images
Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
Ali Naseh
Katherine Thai
Mohit Iyyer
Amir Houmansadr
33
5
0
21 Apr 2024
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image
  Synthesis
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Yuxi Ren
Xin Xia
Yanzuo Lu
Jiacheng Zhang
Jie Wu
Pan Xie
Xing Wang
Xuefeng Xiao
34
61
0
21 Apr 2024
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than
  We Think
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think
Haotian Xue
Yongxin Chen
DiffM
AAML
39
3
0
20 Apr 2024
Adaptive Memory Replay for Continual Learning
Adaptive Memory Replay for Continual Learning
James Seale Smith
Lazar Valkov
Shaunak Halbe
V. Gutta
Rogerio Feris
Z. Kira
Leonid Karlinsky
31
6
0
18 Apr 2024
MMInA: Benchmarking Multihop Multimodal Internet Agents
MMInA: Benchmarking Multihop Multimodal Internet Agents
Ziniu Zhang
Shulin Tian
Liangyu Chen
Ziwei Liu
LLMAG
LM&Ro
27
13
0
15 Apr 2024
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion
  Models
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Nithin Gopalakrishnan Nair
Jeya Maria Jose Valanarasu
Vishal M. Patel
MoMe
33
7
0
15 Apr 2024
Knowledge-enhanced Visual-Language Pretraining for Computational
  Pathology
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou
Xiaoman Zhang
Chaoyi Wu
Ya-Qin Zhang
Weidi Xie
Yanfeng Wang
VLM
27
6
0
15 Apr 2024
Previous
123...567...91011
Next