ResearchTrend.AI
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP

10 August 2022
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
    CLIP
    VLM

Papers citing "Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP"

50 / 86 papers shown
Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation
Thomas Kerdreux
A. Tuel
Quentin Febvre
A. Mouche
Bertrand Chapron
73
0
0
09 Apr 2025
Advancing Medical Representation Learning Through High-Quality Data
Negin Baghbanzadeh
Adibvafa Fallahpour
Yasaman Parhizkar
Franklin Ogidi
Shuvendu Roy
...
Vahid Reza Khazaie
Michael Colacci
Ali Etemad
Arash Afkanpour
Elham Dolatabadi
LM&MA
83
0
0
18 Mar 2025
ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models
Yassir Bendou
Amine Ouasfi
Vincent Gripon
A. Boukhayma
VLM
51
0
0
19 Jan 2025
The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
Scott Geng
Cheng-Yu Hsieh
Vivek Ramanujan
Matthew Wallingford
Chun-Liang Li
Pang Wei Koh
Ranjay Krishna
DiffM
60
6
0
03 Jan 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
J. Liu
N. Shah
Ping Chen
83
2
0
18 Dec 2024
Identifying Implicit Social Biases in Vision-Language Models
Kimia Hamidieh
Haoran Zhang
Walter Gerych
Thomas Hartvigsen
Marzyeh Ghassemi
VLM
28
11
0
01 Nov 2024
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification
Benjamin Feuer
Jiawei Xu
Niv Cohen
Patrick Yubeaton
Govind Mittal
Chinmay Hegde
18
1
0
07 Oct 2024
Toward a Holistic Evaluation of Robustness in CLIP Models
Weijie Tu
Weijian Deng
Tom Gedeon
VLM
34
5
0
02 Oct 2024
Unsupervised Domain Adaptation Via Data Pruning
Andrea Napoli
Paul White
24
1
0
18 Sep 2024
The Data Addition Dilemma
Judy Hanwen Shen
Inioluwa Deborah Raji
Irene Y. Chen
30
5
0
08 Aug 2024
Scaling Sign Language Translation
Biao Zhang
Garrett Tanzer
Orhan Firat
LRM
VLM
SLR
32
1
0
16 Jul 2024
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar
Shravan Nayak
Reza Bayat
Alexis Roger
Daniel Z Kaplan
P. Bashivan
Irina Rish
AAML
VLM
34
1
0
15 Jul 2024
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak
Byeongju Woo
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
37
3
0
12 Jul 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
Zhen Qin
Daoyuan Chen
Wenhao Zhang
Liuyi Yao
Yilun Huang
Bolin Ding
Yaliang Li
Shuiguang Deng
48
5
0
11 Jul 2024
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang
Aparna Balagopalan
Nassim Oufattole
Hyewon Jeong
Yan Wu
Jiacheng Zhu
Marzyeh Ghassemi
42
0
0
10 Jul 2024
Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models
Reza Abbasi
M. Rohban
M. Baghshah
CoGe
38
5
0
08 Jul 2024
Data-Centric AI in the Age of Large Language Models
Xinyi Xu
Zhaoxuan Wu
Rui Qiao
Arun Verma
Yao Shu
...
Xiaoqiang Lin
Wenyang Hu
Zhongxiang Dai
Pang Wei Koh
Bryan Kian Hsiang Low
ALM
40
2
0
20 Jun 2024
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
30
3
0
31 May 2024
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang
Yifang Chen
Wendan Yan
Alex Fang
Wenjing Zhou
Kevin G. Jamieson
S. Du
32
7
0
29 May 2024
Multilingual Diversity Improves Vision-Language Representations
Thao Nguyen
Matthew Wallingford
Sebastin Santy
Wei-Chiu Ma
Sewoong Oh
Ludwig Schmidt
Pang Wei Koh
Ranjay Krishna
VLM
29
5
0
27 May 2024
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
Shaoyuan Xie
Lingdong Kong
Wenwei Zhang
Jiawei Ren
Liang Pan
Kai-xiang Chen
Ziwei Liu
AAML
50
9
0
27 May 2024
FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models
Adrian Bulat
Yassine Ouali
Georgios Tzimiropoulos
VLM
35
4
0
16 May 2024
Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp
Rachel Hong
William Agnew
Tadayoshi Kohno
Jamie Morgenstern
27
9
0
13 May 2024
HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts
Wonjae Kim
Sanghyuk Chun
Taekyung Kim
Dongyoon Han
Sangdoo Yun
39
7
0
26 Apr 2024
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
Yuwei Tang
Zhenyi Lin
Qilong Wang
Pengfei Zhu
Qinghua Hu
26
11
0
13 Apr 2024
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
Zichao Li
Cihang Xie
E. D. Cubuk
CLIP
32
8
0
12 Apr 2024
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
Simon Schrodi
David T. Hoffmann
Max Argus
Volker Fischer
Thomas Brox
VLM
50
0
0
11 Apr 2024
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao
Ameya Prabhu
Adhiraj Ghosh
Yash Sharma
Philip H. S. Torr
Adel Bibi
Samuel Albanie
Matthias Bethge
VLM
118
44
0
04 Apr 2024
Bridging Remote Sensors with Multisensor Geospatial Foundation Models
Boran Han
Shuai Zhang
Xingjian Shi
Markus Reichstein
24
22
0
01 Apr 2024
A Decade's Battle on Dataset Bias: Are We There Yet?
Zhuang Liu
Kaiming He
32
26
0
13 Mar 2024
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters
Weizhi Wang
Khalil Mrini
Linjie Yang
Sateesh Kumar
Yu Tian
Xifeng Yan
Heng Wang
32
16
0
05 Mar 2024
An Empirical Study Into What Matters for Calibrating Vision-Language Models
Weijie Tu
Weijian Deng
Dylan Campbell
Stephen Gould
Tom Gedeon
VLM
33
7
0
12 Feb 2024
A Closer Look at the Robustness of Contrastive Language-Image Pre-Training (CLIP)
Weijie Tu
Weijian Deng
Tom Gedeon
UQCV
VLM
20
32
0
12 Feb 2024
On Catastrophic Inheritance of Large Foundation Models
Hao Chen
Bhiksha Raj
Xing Xie
Jindong Wang
AI4CE
48
12
0
02 Feb 2024
Cross-modality debiasing: using language to mitigate sub-population shifts in imaging
Yijiang Pang
Hoang Bao
Jiayu Zhou
11
0
0
02 Feb 2024
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
22
75
0
17 Dec 2023
Robustness of Deep Learning for Accelerated MRI: Benefits of Diverse Training Data
Kang Lin
Reinhard Heckel
OOD
27
5
0
16 Dec 2023
BioCLIP: A Vision Foundation Model for the Tree of Life
Samuel Stevens
Jiaman Wu
Matthew J Thompson
Elizabeth G Campolongo
Chan Hee Song
...
Wasila M Dahdul
Charles V. Stewart
Tanya Berger-Wolf
Wei-Lun Chao
Yu-Chuan Su
26
62
0
30 Nov 2023
MLLMs-Augmented Visual-Language Representation Learning
Yanqing Liu
Kai Wang
Wenqi Shao
Ping Luo
Yu Qiao
Mike Zheng Shou
Kaipeng Zhang
Yang You
VLM
21
11
0
30 Nov 2023
Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines
Hamed Damirchi
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Javen Qinfeng Shi
Stephen Gould
A. Hengel
VLM
27
0
0
29 Nov 2023
Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney
Emily Reif
David M. Mimno
43
6
0
15 Nov 2023
Exploring Dataset-Scale Indicators of Data Quality
Ben Feuer
Chinmay Hegde
16
1
0
07 Nov 2023
Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?
Prasanna Mayilvahanan
Thaddäus Wiedemer
E. Rusak
Matthias Bethge
Wieland Brendel
OODD
35
22
0
14 Oct 2023
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models
Vishaal Udandarao
Max F. Burg
Samuel Albanie
Matthias Bethge
VLM
24
8
0
12 Oct 2023
Prompt Backdoors in Visual Prompt Learning
Hai Huang
Zhengyu Zhao
Michael Backes
Yun Shen
Yang Zhang
VLM
VPVLM
AAML
SILM
35
2
0
11 Oct 2023
Propagating Semantic Labels in Video Data
David Balaban
Justin Medich
Pranay Gosar
Justin W. Hart
VLM
33
1
0
01 Oct 2023
Data Filtering Networks
Alex Fang
Albin Madappally Jose
Amit Jain
Ludwig Schmidt
Alexander Toshev
Vaishaal Shankar
CLIP
23
124
0
29 Sep 2023
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen
Jindong Wang
Ankit Shah
Ran Tao
Hongxin Wei
Berfin Şimşek
Masashi Sugiyama
Bhiksha Raj
22
26
0
29 Sep 2023
The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering
Hai-ping Yu
Yu Tian
Sateesh Kumar
Linjie Yang
Heng Wang
VLM
30
17
0
27 Sep 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
22
2
0
07 Aug 2023