Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2110.06915
Cited By
Object-Region Video Transformers
13 October 2021
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Object-Region Video Transformers"
17 / 17 papers shown
Title
Mamba Fusion: Learning Actions Through Questioning
Zhikang Dong
Apoorva Beedu
Jason Sheinkopf
Irfan Essa
Mamba
51
2
0
17 Sep 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
44
4
0
28 Dec 2023
Object-based (yet Class-agnostic) Video Domain Adaptation
Dantong Niu
Amir Bar
Roei Herzig
Trevor Darrell
Anna Rohrbach
22
1
0
29 Nov 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
28
3
0
17 Jul 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
30
14
0
20 Jun 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
19
8
0
20 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
C. L. P. Chen
Mu Li
ViT
24
143
0
06 Feb 2023
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
19
26
0
20 Jul 2022
Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
11
3
0
15 Jun 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
25
651
0
16 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
17
23
0
02 Dec 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
Gaussian Temporal Awareness Networks for Action Localization
Fuchen Long
Ting Yao
Zhaofan Qiu
Xinmei Tian
Jiebo Luo
Tao Mei
122
311
0
09 Sep 2019
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
Tianwei Lin
Xu Zhao
Haisheng Su
Chongjing Wang
Ming Yang
135
691
0
08 Jun 2018
Image Generation from Scene Graphs
Justin Johnson
Agrim Gupta
Li Fei-Fei
GNN
208
809
0
04 Apr 2018
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
A. Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
243
7,597
0
03 Jul 2012
1