ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2207.02159
  4. Cited By
Robustness Analysis of Video-Language Models Against Visual and Language
  Perturbations

Robustness Analysis of Video-Language Models Against Visual and Language Perturbations

5 July 2022
Madeline Chantry Schiappa
Shruti Vyas
Hamid Palangi
Y. S. Rawat
Vibhav Vineet
    VLM
ArXivPDFHTML

Papers citing "Robustness Analysis of Video-Language Models Against Visual and Language Perturbations"

8 / 8 papers shown
Title
Robust Fairness Vision-Language Learning for Medical Image Analysis
Robust Fairness Vision-Language Learning for Medical Image Analysis
Sparsh Bansal
Mingyang Wu
Xin Wang
S. Hu
VLM
19
0
0
06 May 2025
Measure and Improve Robustness in NLP Models: A Survey
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
103
130
0
15 Dec 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language
  Augmentation
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
135
80
0
06 Dec 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text
  Understanding
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
237
444
0
28 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
226
499
0
22 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
270
1,939
0
09 Feb 2021
Learning to Recognize Dialect Features
Learning to Recognize Dialect Features
Dorottya Demszky
D. Sharma
J. Clark
Vinodkumar Prabhakaran
Jacob Eisenstein
73
34
0
23 Oct 2020
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
226
29,632
0
16 Jan 2013
1