2404.07603
Cited By
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
11 April 2024
Jihao Liu
Jinliang Zheng
Yu Liu
Hongsheng Li
VLM
Papers citing "GLID: Pre-training a Generalist Encoder-Decoder Vision Model"
4 / 4 papers shown
Title
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
Alexander Kolesnikov
André Susano Pinto
Lucas Beyer
Xiaohua Zhai
Jeremiah Harmsen
N. Houlsby
103
67
0
20 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
233
341
0
22 Sep 2021
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
166
21,643
0
09 Dec 2016