ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.04183
  4. Cited By
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language
  Representation Learning

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

9 October 2022
Zijia Zhao
Longteng Guo
Xingjian He
Shuai Shao
Zehuan Yuan
Jing Liu
ArXivPDFHTML

Papers citing "MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning"

12 / 12 papers shown
Title
Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach
Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach
Francesco P. Ramunno
Paolo Massa
Vitaliy Kinakh
Brandon Panos
A. Csillaghy
S. Voloshynovskiy
DiffM
53
0
0
31 Mar 2025
SyncMask: Synchronized Attentional Masking for Fashion-centric
  Vision-Language Pretraining
SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining
Chull Hwan Song
Taebaek Hwang
Jooyoung Yoon
Shunghyun Choi
Yeong Hyeon Gu
21
4
0
01 Apr 2024
Enhancing EEG-to-Text Decoding through Transferable Representations from
  Pre-trained Contrastive EEG-Text Masked Autoencoder
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
Jiaqi Wang
Zhenxi Song
Zhengyu Ma
Xipeng Qiu
Min Zhang
Zhiguo Zhang
29
5
0
27 Feb 2024
CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora
CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora
Zijun Long
Xuri Ge
R. McCreadie
Joemon M. Jose
17
5
0
23 Feb 2024
LitLLM: A Toolkit for Scientific Literature Review
LitLLM: A Toolkit for Scientific Literature Review
Shubham Agarwal
I. Laradji
Laurent Charlin
Christopher Pal
Krishnamurthy DJ Dvijotham
Jason Stanley
Laurent Charlin
Christopher Pal
33
20
0
02 Feb 2024
EVE: Efficient Vision-Language Pre-training with Masked Prediction and
  Modality-Aware MoE
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen
Longteng Guo
Jianxiang Sun
Shuai Shao
Zehuan Yuan
Liang Lin
Dongyu Zhang
MLLM
VLM
MoE
43
9
0
23 Aug 2023
Training Vision-Language Transformers from Captions
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
172
11
0
19 May 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
295
5,761
0
29 Apr 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,764
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
525
0
04 Feb 2021
1