2301.02280
Cited By
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
5 January 2023
Filip Radenovic
Abhimanyu Dubey
Abhishek Kadian
Todor Mihaylov
Simon Vandenhende
Yash J. Patel
Y. Wen
Vignesh Ramanathan
D. Mahajan
VLM
Papers citing "Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training"
11 / 11 papers shown
MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation
Taha Koleilat
Hojat Asgariandehkordi
H. Rivaz
Yiming Xiao
MedIm
VLM
36
6
0
28 Sep 2024
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
Yu-Guan Hsieh
Cheng-Yu Hsieh
Shih-Ying Yeh
Louis Béthune
Hadi Pour Ansari
Pavan Kumar Anasosalu Vasu
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Marco Cuturi
58
4
0
09 Jul 2024
A Cross-Dataset Study for Text-based 3D Human Motion Retrieval
Léore Bensabath
Mathis Petrovich
Gül Varol
33
2
0
27 May 2024
Effective pruning of web-scale datasets based on complexity of concept clusters
Amro Abbas
E. Rusak
Kushal Tirumala
Wieland Brendel
Kamalika Chaudhuri
Ari S. Morcos
VLM
CLIP
13
22
0
09 Jan 2024
Mitigating Open-Vocabulary Caption Hallucinations
Assaf Ben-Kish
Moran Yanuka
Morris Alper
Raja Giryes
Hadar Averbuch-Elor
MLLM
VLM
11
6
0
06 Dec 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
31
140
0
10 Nov 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Avamarie Brueggeman
Andrea Madotto
Zhaojiang Lin
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
24
92
0
27 Sep 2023
COLA: A Benchmark for Compositional Text-to-image Retrieval
Arijit Ray
Filip Radenovic
Abhimanyu Dubey
Bryan A. Plummer
Ranjay Krishna
Kate Saenko
CoGe
VLM
28
34
0
05 May 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
385
4,010
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014