ResearchTrend.AI
X3D: Expanding Architectures for Efficient Video Recognition

9 April 2020
Christoph Feichtenhofer

Papers citing "X3D: Expanding Architectures for Efficient Video Recognition"

50 / 526 papers shown
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li
Xianhang Li
Yali Wang
Jun Wang
Yu Qiao
ViT
14
55
0
03 Jun 2021
TSI: Temporal Saliency Integration for Video Action Recognition
Haisheng Su
Kunchang Li
Jinyuan Feng
Dongliang Wang
Weihao Gan
Wei Wu
Yu Qiao
16
4
0
02 Jun 2021
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
Lukas Hedegaard
Alexandros Iosifidis
3DPC
6
11
0
31 May 2021
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Wenhao Wu
Yuxiang Zhao
Yanwu Xu
Xiao Tan
Dongliang He
...
Jinxing Ye
Yingying Li
Mingde Yao
Zichao Dong
Yifeng Shi
AI4TS
15
27
0
25 May 2021
Temporal Action Proposal Generation with Transformers
Lining Wang
Haosen Yang
Wenhao Wu
H. Yao
Hujie Huang
ViT
17
23
0
25 May 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
Srijan Das
Rui Dai
Di Yang
F. Brémond
ViT
28
39
0
17 May 2021
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations
Taojiannan Yang
Sijie Zhu
Matías Mendieta
Pu Wang
Ravikumar Balakrishnan
Minwoo Lee
T. Han
M. Shah
C. L. P. Chen
3DH
OOD
16
20
0
14 May 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Rameswar Panda
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
17
41
0
11 May 2021
VideoLT: Large-scale Long-tailed Video Recognition
Xing Zhang
Zuxuan Wu
Zejia Weng
H. Fu
Jingjing Chen
Yu-Gang Jiang
Larry S. Davis
19
41
0
06 May 2021
Revisiting Skeleton-based Action Recognition
Haodong Duan
Yue Zhao
Kai-xiang Chen
Dahua Lin
Bo Dai
3DH
14
471
0
28 Apr 2021
FrameExit: Conditional Early Exiting for Efficient Video Recognition
Amir Ghodrati
B. Bejnordi
A. Habibian
26
81
0
27 Apr 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
119
178
0
23 Apr 2021
Skip-Convolutions for Efficient Video Processing
A. Habibian
Davide Abati
Taco S. Cohen
B. Bejnordi
41
50
0
23 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
17
1,017
0
22 Apr 2021
H2O: Two Hands Manipulating Objects for First Person Interaction Recognition
Taein Kwon
Bugra Tekin
Jan Stühmer
Federica Bogo
Marc Pollefeys
EgoV
18
166
0
22 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
229
573
0
22 Apr 2021
MGSampler: An Explainable Sampling Strategy for Video Action Recognition
Yuan Zhi
Zhan Tong
Limin Wang
Gangshan Wu
TTA
11
71
0
20 Apr 2021
HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition
Zejia Weng
Zuxuan Wu
Hengduo Li
Jingjing Chen
Yu-Gang Jiang
10
3
0
20 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
8
69
0
19 Apr 2021
Higher Order Recurrent Space-Time Transformer for Video Action Prediction
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
O. Lanz
14
9
0
17 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
11
19
0
02 Apr 2021
TubeR: Tubelet Transformer for Video Action Detection
Jiaojiao Zhao
Yanyi Zhang
Xinyu Li
Hao Chen
Shuai Bing
...
Yuanjun Xiong
Davide Modolo
I. Marsic
Cees G. M. Snoek
Joseph Tighe
ViT
15
69
0
02 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
22
15
0
01 Apr 2021
Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling
Sian Jin
Jesus Pulido
Pascal Grosset
Jiannan Tian
Dingwen Tao
J. Ahrens
8
22
0
01 Apr 2021
Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
6
30
0
30 Mar 2021
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation
Shuning Chang
Pichao Wang
F. Wang
Hao Li
Jiashi Feng
ViT
26
41
0
30 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
11
2,041
0
29 Mar 2021
Busy-Quiet Video Disentangling for Video Classification
Guoxi Huang
A. Bors
16
6
0
29 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
6
94
0
25 Mar 2021
AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition
Lei Shi
Yifan Zhang
Jian Cheng
Hanqing Lu
22
46
0
22 Mar 2021
MoViNets: Mobile Video Networks for Efficient Video Recognition
Dan Kondratyuk
Liangzhe Yuan
Yandong Li
Li Zhang
Mingxing Tan
Matthew A. Brown
Boqing Gong
8
225
0
21 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
11
9
0
21 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Tsung-Yi Lin
Jonathon Shlens
Barret Zoph
17
295
0
13 Mar 2021
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
Yinan He
Bei Gan
Siyu Chen
Yichun Zhou
Guojun Yin
Luchuan Song
Lu Sheng
Jing Shao
Ziwei Liu
AAML
6
103
0
09 Mar 2021
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya
Michael S. Ryoo
AI4TS
20
33
0
01 Mar 2021
ROAD: The ROad event Awareness Dataset for Autonomous Driving
Gurkirt Singh
Stephen Akrigg
Manuele Di Maio
Valentina Fontana
Reza Javanmard Alitappeh
...
Salman Khan
S. Grazioso
Andrew Bradley
G. Gironimo
Fabio Cuzzolin
19
89
0
23 Feb 2021
VA-RED$^2$: Video Adaptive Redundancy Reduction
Bowen Pan
Rameswar Panda
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
10
19
0
15 Feb 2021
Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
TTA
11
38
0
14 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Mohit Bansal
Jingjing Liu
CLIP
21
565
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
272
1,939
0
09 Feb 2021
Video Action Recognition Using spatio-temporal optical flow video frames
Aytekin Nebisoy
Saber Malekzadeh
9
1
0
05 Feb 2021
Semi-Supervised Action Recognition with Temporal Contrastive Learning
Ankit Singh
Omprakash Chakraborty
Ashutosh Varshney
Rameswar Panda
Rogerio Feris
Kate Saenko
Abir Das
20
94
0
04 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
188
375
0
01 Feb 2021
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou
Stan Weixian Lei
Weiyao Wang
Deepti Ghadiyaram
Matt Feiszli
VOS
61
70
0
26 Jan 2021
RGB-D Salient Object Detection via 3D Convolutional Neural Networks
Qian Chen
Ze Liu
Y. Zhang
Keren Fu
Qijun Zhao
H. Du
3DPC
24
148
0
25 Jan 2021
Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting
Sovan Biswas
Juergen Gall
14
2
0
21 Jan 2021
A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
Ashraful Islam
Chengjiang Long
Richard J. Radke
14
105
0
03 Jan 2021
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li
Zuxuan Wu
Abhinav Shrivastava
L. Davis
19
35
0
29 Dec 2020
Human Action Recognition from Various Data Modalities: A Review
Zehua Sun
Qiuhong Ke
Hossein Rahmani
Mohammed Bennamoun
Gang Wang
Jun Liu
MU
27
354
0
22 Dec 2020
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
6
338
0
18 Dec 2020