2112.03562
CMA-CLIP: Cross-Modality Attention CLIP for Image-Text Classification
7 December 2021
Huidong Liu
Shaoyuan Xu
Jinmiao Fu
Yang Liu
Ning Xie
Chien Wang
Bryan Wang
Yi Sun
CLIP
VLM
Papers citing
"CMA-CLIP: Cross-Modality Attention CLIP for Image-Text Classification"
10 / 10 papers shown
Title
TeG-DG: Textually Guided Domain Generalization for Face Anti-Spoofing
Lianrui Mu
Jianhong Bai
Xiaoxuan He
Jiangnan Ye
Xiaoyu Liang
Yuchen Yang
Jiedong Zhuang
Haoji Hu
22
2
0
30 Nov 2023
Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes
Anant Khandelwal
Happy Mittal
S. Kulkarni
D. Gupta
17
9
0
01 Jun 2023
Towards ethical multimodal systems
Alexis Roger
Esma Aïmeur
Irina Rish
24
3
0
26 Apr 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
22
106
0
27 Mar 2023
Contrastive language and vision learning of general fashion concepts
P. Chia
Giuseppe Attanasio
Federico Bianchi
Silvia Terragni
A. Magalhães
Diogo Gonçalves
C. Greco
Jacopo Tagliabue
CLIP
13
42
0
08 Apr 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,367
0
09 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
926
0
24 Sep 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
57
241
0
06 Sep 2019
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
268
10,214
0
16 Nov 2016