arXiv:2404.04125
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
4 April 2024
Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H. S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
Tags: VLM
Papers citing "No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" (18 papers shown)
Reducing Annotation Burden in Physical Activity Research Using Vision-Language Models (06 May 2025)
Abram Schonfeldt, Benjamin Maylor, Xiaofang Chen, Ronald Clark, Aiden Doherty

A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI (26 Mar 2025)
Alejandro Lozano, M. W. Sun, James Burgess, Jeffrey Nirschl, Christopher Polzak, ..., Xiaohan Wang, Alfred Seunghoon Song, Chiang Chia-Chun, Robert Tibshirani, Serena Yeung-Levy
Tags: LM&MA

Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding (17 Feb 2025)
Kung-Hsiang Huang, Can Qin, Haoyi Qiu, Philippe Laban, Shafiq R. Joty, Caiming Xiong, C. Wu
Tags: VLM

Audio-Language Datasets of Scenes and Events: A Survey (10 Jan 2025)
Gijs Wijngaard, Elia Formisano, Michele Esposito, M. Dumontier

The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better (03 Jan 2025)
Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, Ranjay Krishna
Tags: DiffM

Estimating Causal Effects of Text Interventions Leveraging LLMs (28 Oct 2024)
Siyi Guo, Myrl G. Marmarelis, Fred Morstatter, Kristina Lerman
Tags: CML

SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning (26 Sep 2024)
Rimvydas Rubavicius, Peter David Fagan, A. Lascarides, Subramanian Ramamoorthy
Tags: LM&Ro

SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? (02 Feb 2024)
Hasan Hammoud, Hani Itani, Fabio Pizzati, Philip H. S. Torr, Adel Bibi, Bernard Ghanem
Tags: CLIP, VLM

From Categories to Classifier: Name-Only Continual Learning by Exploring the Web (19 Nov 2023)
Ameya Prabhu, Hasan Hammoud, Ser-Nam Lim, Bernard Ghanem, Philip H. S. Torr, Adel Bibi
Tags: CLL

Holistic Evaluation of Text-To-Image Models (07 Nov 2023)
Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon Sung Park, ..., Jun-Yan Zhu, Fei-Fei Li, Jiajun Wu, Stefano Ermon, Percy Liang

Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 (28 Apr 2023)
Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
Tags: RALM

Generating images of rare concepts using pre-trained diffusion models (27 Apr 2023)
Dvir Samuel, Rami Ben-Ari, Simon Raviv, N. Darshan, Gal Chechik

Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation (29 Mar 2023)
Julio Silva-Rodríguez, Jose Dolz, Ismail Ben Ayed

CyCLIP: Cyclic Contrastive Language-Image Pretraining (28 May 2022)
Shashank Goel, Hritik Bansal, S. Bhatia, Ryan A. Rossi, Vishwa Vinay, Aditya Grover
Tags: CLIP, VLM

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (08 Feb 2022)
Jaemin Cho, Abhaysinh Zala, Mohit Bansal
Tags: ViT

Deduplicating Training Data Makes Language Models Better (14 Jul 2021)
Katherine Lee, Daphne Ippolito, A. Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini
Tags: SyDa

Zero-Shot Text-to-Image Generation (24 Feb 2021)
Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever
Tags: VLM

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts (17 Feb 2021)
Soravit Changpinyo, P. Sharma, Nan Ding, Radu Soricut
Tags: VLM