ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2004.08744
  4. Cited By
Are we pretraining it right? Digging deeper into visio-linguistic
  pretraining

Are we pretraining it right? Digging deeper into visio-linguistic pretraining

19 April 2020
Amanpreet Singh
Vedanuj Goswami
Devi Parikh
    VLM
ArXivPDFHTML

Papers citing "Are we pretraining it right? Digging deeper into visio-linguistic pretraining"

11 / 11 papers shown
Title
Training Vision-Language Models with Less Bimodal Supervision
Training Vision-Language Models with Less Bimodal Supervision
Elad Segal
Ben Bogin
Jonathan Berant
VLM
19
2
0
01 Nov 2022
Hateful Memes Challenge: An Enhanced Multimodal Framework
Hateful Memes Challenge: An Enhanced Multimodal Framework
Aijing Gao
Bingjun Wang
Jiaqi Yin
Yating Tian
8
2
0
20 Dec 2021
InfographicVQA
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
14
199
0
26 Apr 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal
  Transformers
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
75
110
0
31 Jan 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
926
0
24 Sep 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
57
241
0
06 Sep 2019
Boosting Self-Supervised Learning via Knowledge Transfer
Boosting Self-Supervised Learning via Knowledge Transfer
M. Noroozi
Ananth Vinjimoor
Paolo Favaro
Hamed Pirsiavash
SSL
207
291
0
01 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Neural Baby Talk
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
189
434
0
27 Mar 2018
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,196
0
16 Nov 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and
  Visual Grounding
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,464
0
06 Jun 2016
1