2311.06386
Towards A Unified Neural Architecture for Visual Recognition and Reasoning
10 November 2023
Calvin Luo
Boqing Gong
Ting Chen
Chen Sun
OCL
ObjD
Papers citing "Towards A Unified Neural Architecture for Visual Recognition and Reasoning"
6 / 6 papers shown
Title
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
233
341
0
22 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,554
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
On the Binding Problem in Artificial Neural Networks
Klaus Greff
Sjoerd van Steenkiste
Jürgen Schmidhuber
OCL
224
252
0
09 Dec 2020
Learning Object Permanence from Video
Aviv Shamsian
Ofri Kleinfeld
Amir Globerson
Gal Chechik
SSL
29
31
0
23 Mar 2020