Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning

27 February 2024
Maurits J. R. Bleeker
Mariya Hendriksen
Andrew Yates
Maarten de Rijke
    VLM

Papers citing "Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning"

9 / 9 papers shown
Probing Mechanical Reasoning in Large Vision Language Models
Haoran Sun
Qingying Gao
Haiyun Lyu
Dezhi Luo
Yijiang Li
Hokin Deng
LRM
33
2
0
01 Oct 2024
Assessing Brittleness of Image-Text Retrieval Benchmarks from Vision-Language Models Perspective
Mariya Hendriksen
Shuo Zhang
R. Reinanda
Mohamed Yahya
Edgar Meij
Maarten de Rijke
38
0
0
21 Jul 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
385
4,010
0
28 Jan 2022
A Fine-Grained Analysis on Distribution Shift
Olivia Wiles
Sven Gowal
Florian Stimberg
Sylvestre-Alvise Rebuffi
Ira Ktena
Krishnamurthy Dvijotham
A. Cemgil
OOD
215
196
0
21 Oct 2021
Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective
Luca Scimeca
Seong Joon Oh
Sanghyuk Chun
Michael Poli
Sangdoo Yun
OOD
374
49
0
06 Oct 2021
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
166
17
0
06 Oct 2021
Compressive Visual Representations
Kuang-Huei Lee
Anurag Arnab
S. Guadarrama
John F. Canny
Ian S. Fischer
SSL
54
48
0
27 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021