ResearchTrend.AI
Video Action Transformer Network

6 December 2018
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
    ViT

Papers citing "Video Action Transformer Network"

22 / 122 papers shown
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
28
53
0
28 Dec 2020
Preclinical Stage Alzheimer's Disease Detection Using Magnetic Resonance Image Scans
Fatih Altay
G. Sánchez
Y. James
S. Faraone
Senem Velipasalar
Asif Salekin
15
37
0
28 Nov 2020
t-EVA: Time-Efficient t-SNE Video Annotation
Soroosh Poorgholi
O. Kayhan
J. C. V. Gemert
9
5
0
26 Nov 2020
We don't Need Thousand Proposals: Single Shot Actor-Action Detection in Videos
A. J. Rana
Y. S. Rawat
ViT
13
11
0
22 Nov 2020
Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition
T. Ayral
M. Pedersoli
Simon L Bacon
Eric Granger
CVBM
3DH
13
11
0
10 Nov 2020
FlowCaps: Optical Flow Estimation with Capsule Networks For Action Recognition
Vinoj Jayasundara
D. Roy
Basura Fernando
3DPC
18
3
0
08 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Rameswar Panda
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
21
95
0
22 Oct 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
Aydin Alatan
3DPC
28
140
0
03 Aug 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
22
79
0
20 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
19
11
0
08 Jul 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
15
150
0
14 Jun 2020
The AVA-Kinetics Localized Human Actions Video Dataset
Ang Li
Meghana Thotakuri
David A. Ross
João Carreira
Alexander Vostrikov
Andrew Zisserman
VGen
9
133
0
01 May 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
66
998
0
09 Apr 2020
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
36
51
0
29 Mar 2020
Actor-Transformers for Group Activity Recognition
Kirill Gavrilyuk
Ryan Sanford
Mehrsan Javan
Cees G. M. Snoek
ViT
19
178
0
28 Mar 2020
PIC: Permutation Invariant Convolution for Recognizing Long-range Activities
Noureldien Hussein
E. Gavves
A. Smeulders
VLM
18
13
0
18 Mar 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
194
205
0
23 Jan 2020
Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Jingwei Ji
Ranjay Krishna
Li Fei-Fei
Juan Carlos Niebles
39
335
0
15 Dec 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
Okan Kopuklu
Xiangyu Wei
Gerhard Rigoll
20
143
0
15 Nov 2019
Lightweight Network Architecture for Real-Time Action Recognition
Alexander Kozlov
Vadim Andronov
Y. Gritsenko
ViT
19
33
0
21 May 2019
Representation Learning on Visual-Symbolic Graphs for Video Understanding
E. Mavroudi
Benjamín Béjar Haro
René Vidal
24
8
0
17 May 2019
DistInit: Learning Video Representations Without a Single Labeled Video
Rohit Girdhar
Du Tran
Lorenzo Torresani
Deva Ramanan
19
54
0
26 Jan 2019