Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks

22 December 2020
Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto

Papers citing "Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks"

22 citing papers shown.
1. What's Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
   Zhaotian Weng, Haoxuan Li, Kuan-Hao Huang, Jieyu Zhao. 01 Jun 2025.

2. VAQUUM: Are Vague Quantifiers Grounded in Visual Data? (ACL 2025)
   Hugh Mee Wong, Rick Nouwen, Albert Gatt. 17 Feb 2025.

3. CV-Probes: Studying the interplay of lexical and world knowledge in visually grounded verb understanding
   Ivana Beňová, Michal Gregor, Albert Gatt. 02 Sep 2024.

4. What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
   Michal Golovanevsky, William Rudman, Vedant Palit, Ritambhara Singh, Carsten Eickhoff. 24 Jun 2024.

5. ColorFoil: Investigating Color Blindness in Large Vision and Language Models
   Ahnaf Mozib Samin, M. F. Ahmed, Md. Mushtaq Shahriyar Rafee. 19 May 2024.

6. Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking
   Ivana Beňová, Jana Kosecka, Michal Gregor, Martin Tamajka, Marcel Veselý, Marian Simko. 29 Jan 2024.

7. The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models
   Chenwei Wu, Erran L. Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang. 04 Oct 2023.

8. Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?
   Haiwei Yang, Liang Ding, Jun Rao, Ye Liu, Li Shen, Changxing Ding. 24 Aug 2023.

9. Controlling for Stereotypes in Multimodal Language Model Evaluation (BlackboxNLP 2023)
   Manuj Malik, Richard Johansson. 03 Feb 2023.

10. MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks (ACL 2022)
    Letitia Parcalabescu, Anette Frank. 15 Dec 2022.

11. Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies? (EMNLP 2022)
    Mitja Nikolaus, Emmanuelle Salin, Stéphane Ayache, Abdellah Fourtassi, Benoit Favre. 21 Oct 2022.

12. Probing Cross-modal Semantics Alignment Capability from the Textual Perspective (EMNLP 2022)
    Zheng Ma, Shi Zong, Mianzhi Pan, Jianbing Zhang, Shujian Huang, Xinyu Dai, Jiajun Chen. 18 Oct 2022.

13. When and why vision-language models behave like bags-of-words, and what to do about it? (ICLR 2022)
    Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou. 04 Oct 2022.

14. Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality (CVPR 2022)
    Tristan Thrush, Ryan Jiang, Max Bartolo, Amanpreet Singh, Adina Williams, Douwe Kiela, Candace Ross. 07 Apr 2022.

15. On Explaining Multimodal Hateful Meme Detection Models (WWW 2022)
    Ming Shan Hee, Roy Ka-wei Lee, Wen-Haw Chong. 04 Apr 2022.

16. Finding Structural Knowledge in Multimodal-BERT (ACL 2022)
    Victor Milewski, Miryam de Lhoneux, Marie-Francine Moens. 17 Mar 2022.

17. DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations (AIES 2022)
    Yiwei Lyu, Paul Pu Liang, Zihao Deng, Ruslan Salakhutdinov, Louis-Philippe Morency. 03 Mar 2022.

18. VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
    Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt. 14 Dec 2021.

19. TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning (EACL 2021)
    Keng Ji Chow, Samson Tan, MingSung Kan. 21 Nov 2021.

20. Recent Advances of Continual Learning in Computer Vision: An Overview (IET Computer Vision, 2021)
    Haoxuan Qu, Hossein Rahmani, Kepeng Xu, Bryan M. Williams, Jun Liu. 23 Sep 2021.

21. What Vision-Language Models 'See' when they See Scenes
    Michele Cafagna, Kees van Deemter, Albert Gatt. 15 Sep 2021.

22. Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers (EMNLP 2021)
    Stella Frank, Emanuele Bugliarello, Desmond Elliott. 09 Sep 2021.