ResearchTrend.AI

arXiv:2206.10436

Transformer-Based Multi-modal Proposal and Re-Rank for Wikipedia Image-Caption Matching

21 June 2022
Nicola Messina
D. Coccomini
Andrea Esuli
Fabrizio Falchi

Papers citing "Transformer-Based Multi-modal Proposal and Re-Rank for Wikipedia Image-Caption Matching"

2 / 2 papers shown
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
240
577
0
22 Apr 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
197
310
0
02 Mar 2021