arXiv:1711.11248
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition" (50 of 1,270 papers shown)
Improved Esophageal Varices Assessment from Non-Contrast CT Scans
Chunli Li
Xiaoming Zhang
Yuan Gao
Xiaoli Yin
Le Lu
Ling Zhang
Ke Yan
Yu Shi
51
0
0
18 Jul 2024
Human-Centric Transformer for Domain Adaptive Action Recognition
Kun-Yu Lin
Jiaming Zhou
Wei-Shi Zheng
38
6
0
15 Jul 2024
VideoMamba: Spatio-Temporal Selective State Space Model
Jinyoung Park
Hee-Seon Kim
Kangwook Ko
Minbeom Kim
Changick Kim
Mamba
42
7
0
11 Jul 2024
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Jinxing Zhou
Dan Guo
Yuxin Mao
Yiran Zhong
Xiaojun Chang
Meng Wang
44
12
0
11 Jul 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang
Zhou Xu
Yuming Lin
Jingtao Ye
Hongsheng Li
Guangming Zhu
Syed Afaq Ali Shah
Mohammed Bennamoun
Liang Zhang
AI4TS
46
5
0
06 Jul 2024
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
40
5
0
03 Jul 2024
FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
Haodong Chen
Haojian Huang
Junhao Dong
Mingzhe Zheng
Dian Shao
45
16
0
02 Jul 2024
Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition
Lan Chen
Dong Li
Xiao Wang
Pengpeng Shao
Wei Zhang
Yaowei Wang
Yonghong Tian
Jin Tang
80
2
0
27 Jun 2024
Dark Transformer: A Video Transformer for Action Recognition in the Dark
Anwaar Ulhaq
ViT
22
0
0
25 Jun 2024
SVFormer: A Direct Training Spiking Transformer for Efficient Video Action Recognition
Liutao Yu
Liwei Huang
Chenlin Zhou
Han Zhang
Zhengyu Ma
Huihui Zhou
Yonghong Tian
ViT
57
4
0
21 Jun 2024
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition
Yang Wang
Haiyang Mei
Qirui Bao
Ziqi Wei
Mike Zheng Shou
Haizhou Li
Bo Dong
Xin Yang
46
1
0
20 Jun 2024
Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition
Anqi Zhu
Qiuhong Ke
Mingming Gong
James Bailey
37
6
0
19 Jun 2024
GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement
Hao Wang
Euijoon Ahn
Jinman Kim
45
0
0
19 Jun 2024
PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification
Magdalena Trędowicz
Łukasz Struski
Marcin Mazur
Szymon Janusz
Arkadiusz Lewicki
Jacek Tabor
28
1
0
17 Jun 2024
Video-based Exercise Classification and Activated Muscle Group Prediction with Hybrid X3D-SlowFast Network
Manvik Pasula
Pramit Saha
29
0
0
10 Jun 2024
MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome
Yixin Huang
Yiqi Jin
Ke Tao
Kaijian Xia
Jianfeng Gu
Lei Yu
Lan Du
Cunjian Chen
41
0
0
07 Jun 2024
Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network
Xinquan Yang
Xuguang Li
Xiaoling Luo
Leilei Zeng
Yudi Zhang
Linlin Shen
Yongqiang Deng
MedIm
48
2
0
07 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Qifeng Chen
Yu Guo
VGen
104
16
0
06 Jun 2024
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
Chi-Jui Chang
Oscar Tai-Yuan Chen
Vincent S. Tseng
VLM
31
2
0
04 Jun 2024
Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-wise Pseudo Labeling
Jinxing Zhou
Dan Guo
Yiran Zhong
Meng Wang
VLM
61
18
0
03 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
43
7
0
02 Jun 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
36
0
0
26 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan Yuille
Cihang Xie
AI4TS
VGen
SSL
59
1
0
24 May 2024
Counterfactual Gradients-based Quantification of Prediction Trust in Neural Networks
Mohit Prabhushankar
Ghassan AlRegib
UQCV
29
0
0
22 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
35
9
0
22 May 2024
Inconsistency-Aware Cross-Attention for Audio-Visual Fusion in Dimensional Emotion Recognition
R Gnana Praveen
Jahangir Alam
58
0
0
21 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
TENNs-PLEIADES: Building Temporal Kernels with Orthogonal Polynomials
Yan Ru Pei
Olivier Coenen
37
3
0
20 May 2024
CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Faegheh Sardari
A. Mustafa
Philip J. B. Jackson
Adrian Hilton
29
3
0
17 May 2024
Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline
Qi Jia
Baoyu Fan
Cong Xu
Lu Liu
Liang Jin
Guoguang Du
Zhenhua Guo
Yaqian Zhao
Xuanjing Huang
Rengang Li
37
0
0
15 May 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
A Survey on Backbones for Deep Video Action Recognition
Zixuan Tang
Youjun Zhao
Yuhang Wen
Mengyuan Liu
41
1
0
09 May 2024
Adversary-Guided Motion Retargeting for Skeleton Anonymization
Thomas Carr
Depeng Xu
Aidong Lu
19
2
0
08 May 2024
Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
54
21
0
07 May 2024
JOSENet: A Joint Stream Embedding Network for Violence Detection in Surveillance Videos
Pietro Nardelli
Danilo Comminiello
32
0
0
05 May 2024
Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning
Deng Li
Bohao Xing
Xin Liu
37
5
0
03 May 2024
Self-supervised learning for classifying paranasal anomalies in the maxillary sinus
Debayan Bhattacharya
F. Behrendt
B. Becker
Lennart Maack
Dirk Beyersdorff
...
B. Cheng
D. Eggert
C. Betz
A. Hoffmann
Alexander Schlaefer
SSL
31
0
0
29 Apr 2024
General Item Representation Learning for Cold-start Content Recommendations
Jooeun Kim
Jinri Kim
Kwangeun Yeo
Eungi Kim
Kyoung-Woon On
Jonghwan Mun
Joonseok Lee
VLM
32
1
0
22 Apr 2024
CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation
Lianyu Hu
Wei Feng
Liqing Gao
Zekang Liu
Liang Wan
SLR
32
4
0
17 Apr 2024
STMixer: A One-Stage Sparse Action Detector
Tao Wu
Mengqing Cao
Ziteng Gao
Gangshan Wu
Limin Wang
29
0
0
15 Apr 2024
AI Competitions and Benchmarks: Dataset Development
Romain Egele
Julio C. S. Jacques Junior
Jan N. van Rijn
Isabelle M Guyon
Xavier Baró
Albert Clapés
Prasanna Balaprakash
Sergio Escalera
T. Moeslund
Jun Wan
42
0
0
15 Apr 2024
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild
K. Chumachenko
Alexandros Iosifidis
Moncef Gabbouj
29
6
0
13 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
35
3
0
13 Apr 2024
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera
Yan Ru Pei
Sasskia Brüers
Sébastien Crouzet
Douglas McLelland
Olivier Coenen
31
8
0
13 Apr 2024
MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression Recognition
Linhuang Wang
Xin Kang
Fei Ding
Satoshi Nakagawa
Fuji Ren
19
1
0
12 Apr 2024
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
Tao Wu
Runyu He
Gangshan Wu
Limin Wang
3DH
54
3
0
06 Apr 2024
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
36
0
05 Apr 2024
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Ho-Joong Kim
Jung-Ho Hong
Heejo Kong
Seong-Whan Lee
60
5
0
03 Apr 2024
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
Chih-Chung Hsu
Chia-Ming Lee
Chiang Fan Yang
Yi-Shiuan Chou
Chih-Yu Jiang
Shen-Chieh Tai
Chin-Han Tsai
44
0
0
02 Apr 2024