ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2108.09479
  4. Cited By
Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training

Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training

21 August 2021
Ming Yan
Haiyang Xu
Chenliang Li
Bin Bi
Junfeng Tian
Min Gui
Wei Wang
    VLM
ArXivPDFHTML

Papers citing "Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training"

8 / 8 papers shown
Title
Adapting Pre-trained Language Models to Vision-Language Tasks via
  Dynamic Visual Prompting
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting
Shubin Huang
Qiong Wu
Yiyi Zhou
Weijie Chen
Rongsheng Zhang
Xiaoshuai Sun
Rongrong Ji
VLM
VPVLM
LRM
16
0
0
01 Jun 2023
Masked Vision-Language Transformer in Fashion
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Christos Sakaridis
Luc Van Gool
17
25
0
27 Oct 2022
One does not fit all! On the Complementarity of Vision Encoders for
  Vision and Language Tasks
One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks
Gregor Geigle
Chen Cecilia Liu
Jonas Pfeiffer
Iryna Gurevych
VLM
21
1
0
12 Oct 2022
Injecting Semantic Concepts into End-to-End Image Captioning
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
19
85
0
09 Dec 2021
Generating More Pertinent Captions by Leveraging Semantics and Style on
  Multi-Source Datasets
Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
Marcella Cornia
Lorenzo Baraldi
G. Fiameni
Rita Cucchiara
20
12
0
24 Nov 2021
Achieving Human Parity on Visual Question Answering
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
21
12
0
17 Nov 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
252
157
0
02 Jan 2021
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016
1