ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2109.04448
  4. Cited By
Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in
  Multimodal Transformers

Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers

9 September 2021
Stella Frank
Emanuele Bugliarello
Desmond Elliott
ArXivPDFHTML

Papers citing "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers"

10 / 10 papers shown
Title
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
Michal Golovanevsky
William Rudman
Vedant Palit
Ritambhara Singh
Carsten Eickhoff
31
1
0
24 Jun 2024
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
Letitia Parcalabescu
Anette Frank
MLLM
CoGe
VLM
79
3
0
29 Apr 2024
Controlling for Stereotypes in Multimodal Language Model Evaluation
Controlling for Stereotypes in Multimodal Language Model Evaluation
Manuj Malik
Richard Johansson
18
1
0
03 Feb 2023
Multimodal Learning with Transformers: A Survey
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
518
0
13 Jun 2022
Does a Technique for Building Multimodal Representation Matter? --
  Comparative Analysis
Does a Technique for Building Multimodal Representation Matter? -- Comparative Analysis
Maciej Pawłowski
Anna Wróblewska
Sylwia Sysko-Romañczuk
9
2
0
09 Jun 2022
Analyzing Modality Robustness in Multimodal Sentiment Analysis
Analyzing Modality Robustness in Multimodal Sentiment Analysis
Devamanyu Hazarika
Yingting Li
Bo Cheng
Shuai Zhao
Roger Zimmermann
Soujanya Poria
29
32
0
30 May 2022
Delving Deeper into Cross-lingual Visual Question Answering
Delving Deeper into Cross-lingual Visual Question Answering
Chen Cecilia Liu
Jonas Pfeiffer
Anna Korhonen
Ivan Vulić
Iryna Gurevych
13
8
0
15 Feb 2022
How Much Can CLIP Benefit Vision-and-Language Tasks?
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
185
403
0
13 Jul 2021
Diagnosing Vision-and-Language Navigation: What Really Matters
Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu
Yuankai Qi
P. Narayana
Kazoo Sone
Sugato Basu
X. Wang
Qi Wu
M. Eckstein
W. Wang
LM&Ro
19
50
0
30 Mar 2021
Stanza: A Python Natural Language Processing Toolkit for Many Human
  Languages
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
Peng Qi
Yuhao Zhang
Yuhui Zhang
Jason Bolton
Christopher D. Manning
AI4TS
199
1,638
0
16 Mar 2020
1