X3D: Expanding Architectures for Efficient Video Recognition

9 April 2020
Christoph Feichtenhofer

Papers citing "X3D: Expanding Architectures for Efficient Video Recognition"

50 / 526 papers shown
Title
Model-agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition
Kazuki Omi
Jun Kimata
Toru Tamaki
10
7
0
15 Apr 2022
Continual Inference: A Library for Efficient Online Inference with Deep Neural Networks in PyTorch
Lukas Hedegaard
Alexandros Iosifidis
BDL
3DV
CLL
13
6
0
07 Apr 2022
Long Movie Clip Classification with State-Space Video Models
Md. Mohaiminul Islam
Gedas Bertasius
VLM
23
100
0
04 Apr 2022
TALLFormer: Temporal Action Localization with a Long-memory Transformer
Feng Cheng
Gedas Bertasius
ViT
16
67
0
04 Apr 2022
TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
Huazhang Hu
Sixun Dong
Yiqun Zhao
Dongze Lian
Zhengxin Li
Shenghua Gao
13
47
0
03 Apr 2022
ObjectMix: Data Augmentation by Copy-Pasting Objects in Videos for Action Recognition
Jun Kimata
Tomoya Nitta
Toru Tamaki
23
10
0
01 Apr 2022
Deformable Video Transformer
Jue Wang
Lorenzo Torresani
ViT
14
27
0
31 Mar 2022
CycDA: Unsupervised Cycle Domain Adaptation from Image to Video
Wei Lin
Anna Kukleva
Kunyang Sun
Horst Possegger
Hilde Kuehne
Horst Bischof
VGen
24
7
0
30 Mar 2022
Class-Incremental Learning for Action Recognition in Videos
Jaeyoo Park
Minsoo Kang
Bohyung Han
CLL
8
51
0
25 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
20
768
0
23 Mar 2022
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
10
32
0
22 Mar 2022
FAR: Fourier Aerial Video Recognition
D. Kothandaraman
Tianrui Guan
Xijun Wang
Sean Hu
Ming-Shun Lin
Dinesh Manocha
13
9
0
21 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
S. L. Phung
Xin Li
Khoa Luu
ViT
24
48
0
19 Mar 2022
Group Contextualization for Video Recognition
Y. Hao
Haotong Zhang
Chong-Wah Ngo
Xiangnan He
6
21
0
18 Mar 2022
Surgical Workflow Recognition: from Analysis of Challenges to Architectural Study
Tobias Czempiel
Aidean Sharghi
Magdalini Paschali
Nassir Navab
Omid Mohareri
11
8
0
17 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
O. Lanz
14
22
0
16 Mar 2022
Know your sensORs -- A Modality Study For Surgical Action Classification
Lennart Bastian
Tobias Czempiel
C. Heiliger
K. Karcz
U. Eck
Benjamin Busam
Nassir Navab
10
4
0
16 Mar 2022
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Yutong Chen
Fangyun Wei
Xiao Sun
Zhirong Wu
Stephen Lin
SLR
12
94
0
08 Mar 2022
PAMI-AD: An Activity Detector Exploiting Part-attention and Motion Information in Surveillance Videos
Yunhao Du
Zhihang Tong
Jun-Jun Wan
Binyu Zhang
Yanyun Zhao
11
3
0
08 Mar 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly
Jian Lu
C. Xu
Yuru Zou
16
18
0
06 Mar 2022
A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning
Gerald Schwiebert
C. Weber
Leyuan Qu
Henrique Siqueira
S. Wermter
13
11
0
27 Feb 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
17
46
0
24 Feb 2022
Movies2Scenes: Using Movie Metadata to Learn Scene Representation
Shixing Chen
Chundi Liu
Xiang Hao
Xiaohan Nie
Maxim Arap
Raffay Hamid
10
17
0
22 Feb 2022
Going Deeper into Recognizing Actions in Dark Environments: A Comprehensive Benchmark Study
Yuecong Xu
Jianfei Yang
Haozhi Cao
Jianxiong Yin
Zhenghua Chen
Xiaoli Li
Zhengguo Li
Qiaoqiao Xu
14
2
0
19 Feb 2022
A Coding Framework and Benchmark towards Compressed Video Understanding
Yuan Tian
Guo Lu
Yichao Yan
Guangtao Zhai
L. Chen
Zhiyong Gao
18
21
0
06 Feb 2022
Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset
Karthik Sivarama Krishnan
Koushik Sivarama Krishnan
9
5
0
28 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
133
360
0
24 Jan 2022
VIPriors 2: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
A. Lengyel
Robert-Jan Bruintjes
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
J. C. V. Gemert
VLM
15
11
0
21 Jan 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
24
198
0
20 Jan 2022
Action Keypoint Network for Efficient Video Recognition
Xu Chen
Yahong Han
Xiaohan Wang
Yifang Sun
Yi Yang
3DPC
11
5
0
17 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals
Lijun Yu
Yijun Qian
Wenhe Liu
Alexander G. Hauptmann
4
13
0
14 Jan 2022
Hand-Object Interaction Reasoning
Jian Ma
Dima Damen
9
7
0
13 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
12
235
0
12 Jan 2022
OCSampler: Compressing Videos to One Clip with Single-step Sampling
Jintao Lin
Haodong Duan
Kai-xiang Chen
Dahua Lin
Limin Wang
12
24
0
12 Jan 2022
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
16
211
0
12 Jan 2022
Condensing a Sequence to One Informative Frame for Video Recognition
Zhaofan Qiu
Ting Yao
Y. Shu
Chong-Wah Ngo
Tao Mei
16
9
0
11 Jan 2022
Optimization Planning for 3D ConvNets
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
3DPC
3DH
19
9
0
11 Jan 2022
TSA-Net: Tube Self-Attention Network for Action Quality Assessment
Shunli Wang
Dingkang Yang
Peng Zhai
Chixiao Chen
Lihua Zhang
ViT
11
63
0
11 Jan 2022
Exploring Motion and Appearance Information for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
Yang Liu
16
36
0
03 Jan 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
Yulin Wang
Yang Yue
Yuanze Lin
Haojun Jiang
Zihang Lai
V. Kulikov
Nikita Orlov
Humphrey Shi
Gao Huang
11
44
0
28 Dec 2021
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
6
3
0
17 Dec 2021
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
23
651
0
16 Dec 2021
Temporal Shuffling for Defending Deep Action Recognition Models against Adversarial Attacks
Jaehui Hwang
Huan Zhang
Jun-Ho Choi
Cho-Jui Hsieh
Jong-Seok Lee
AAML
9
2
0
15 Dec 2021
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Yifan Jiang
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
39
1
0
09 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
11
21
0
09 Dec 2021
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya-Qin Zhang
Weidi Xie
VPVLM
VLM
9
361
0
08 Dec 2021
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
F. Brémond
ViT
26
72
0
07 Dec 2021
Dilated convolution with learnable spacings
Ismail Khalfaoui-Hassani
Thomas Pellegrini
T. Masquelier
8
31
0
07 Dec 2021
E$^2$(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition
Chiara Plizzari
M. Planamente
Gabriele Goletto
Marco Cannici
Emanuele Gusso
Matteo Matteucci
Barbara Caputo
EgoV
10
56
0
07 Dec 2021