arXiv:2401.16575
Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking
29 January 2024
Ivana Beňová
Jana Kosecka
Michal Gregor
Martin Tamajka
Marcel Veselý
Marián Simko
Papers citing "Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking"
2 / 2 papers shown
Title
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Nguyen
Viet Dac Lai
Amir Pouran Ben Veyseh
Thien Huu Nguyen
44
115
0
09 Jan 2021