Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2209.09019
Cited By
LAVIS: A Library for Language-Vision Intelligence
15 September 2022
Dongxu Li
Junnan Li
Hung Le
Guangsen Wang
Silvio Savarese
S. Hoi
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LAVIS: A Library for Language-Vision Intelligence"
8 / 8 papers shown
Title
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
8
22
0
03 Jan 2024
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
375
2,713
0
28 Jan 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
237
444
0
28 Sep 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
845
0
17 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
270
1,486
0
09 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
246
461
0
04 Feb 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
249
147
0
02 Jan 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
247
815
0
24 Sep 2019
1