1711.11248
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Papers citing
"A Closer Look at Spatiotemporal Convolutions for Action Recognition"
50 / 1,270 papers shown
Title
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model
Panwen Hu
Nan Xiao
Feifei Li
Yongquan Chen
Rui Huang
VGen
OffRL
60
3
0
07 Nov 2024
Learning Video Representations without Natural Videos
Xueyang Yu
Xinlei Chen
Yossi Gandelsman
VGen
AI4TS
54
0
0
31 Oct 2024
Assessing the Efficacy of Classical and Deep Neuroimaging Biomarkers in Early Alzheimer's Disease Diagnosis
Milla E. Nielsen
Mads Nielsen
Mostafa Mehdipour-Ghazi
AI4CE
21
0
0
31 Oct 2024
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
26
0
0
28 Oct 2024
LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition
N. V. R. Chappa
Khoa Luu
39
1
0
28 Oct 2024
SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity
Kaidi Wang
Jieru Zhao
Shuo Yang
Wenchao Ding
M. Guo
30
0
0
28 Oct 2024
MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
Xin Shen
Heming Du
Hongwei Sheng
Shuyun Wang
Hui Chen
...
Xiaobiao Du
Jiaying Ying
Ruihan Lu
Qingzheng Xu
Xin Yu
SLR
36
3
0
25 Oct 2024
SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition
Jiaqi Chen
Yan Yang
Shizhuo Deng
Da Teng
Liyuan Pan
Mamba
39
1
0
22 Oct 2024
Generalized Multimodal Fusion via Poisson-Nernst-Planck Equation
Jiayu Xiong
Jing Wang
Hengjing Xiang
Jun Xue
Chen Xu
Zhouqiang Jiang
35
0
0
20 Oct 2024
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu
Zhengpu Wang
Mengxian Hu
Ronghao Dang
Xiao Lin
Xun Zhou
Chengju Liu
Qijun Chen
48
1
0
14 Oct 2024
Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
34
0
0
09 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
M. Lyu
Liwei Wang
VLM
33
0
0
08 Oct 2024
Tracking objects that change in appearance with phase synchrony
Sabine Muzellec
Drew Linsley
A. Ashok
E. Mingolla
Girik Malik
Rufin VanRullen
Thomas Serre
31
1
0
02 Oct 2024
CycleCrash: A Dataset of Bicycle Collision Videos for Collision Prediction and Analysis
Nishq Poorav Desai
Ali Etemad
Michael A. Greenspan
40
0
0
30 Sep 2024
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Xinrui Zhou
Yuhao Huang
Haoran Dou
Shijing Chen
Ao Chang
...
Jie Jessie Ren
Ruobing Huang
Jun Cheng
Wufeng Xue
Dong Ni
MedIm
192
0
0
25 Sep 2024
Hand Gesture Classification Based on Forearm Ultrasound Video Snippets Using 3D Convolutional Neural Networks
Keshav Bimbraw
Ankit Talele
Haichong K. Zhang
3DH
16
0
0
24 Sep 2024
Deep Cost Ray Fusion for Sparse Depth Video Completion
Jungeon Kim
S. Kim
Jaesik Park
Seungyong Lee
35
0
0
23 Sep 2024
High-Order Evolving Graphs for Enhanced Representation of Traffic Dynamics
Aditya Humnabadkar
Arindam Sikdar
Benjamin Cave
Huaizhong Zhang
P. Bakaki
Ardhendu Behera
38
0
0
17 Sep 2024
HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions
Alexandru Bobe
Jan van Gemert
31
0
0
16 Sep 2024
Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment
Ohad Cohen
Gershon Hazan
Sharon Gannot
29
0
0
14 Sep 2024
An Efficient and Streaming Audio Visual Active Speaker Detection System
Arnav Kundu
Yanzi Jin
Mohammad Hossein Sekhavat
Max Horton
Danny Tormoen
Devang Naik
21
0
0
13 Sep 2024
SRE-CNN: A Spatiotemporal Rotation-Equivariant CNN for Cardiac Cine MR Imaging
Yuliang Zhu
Jing Cheng
Zhuo-Xu Cui
Jianfeng Ren
Chengbo Wang
Dong Liang
26
2
0
13 Sep 2024
TabMixer: Noninvasive Estimation of the Mean Pulmonary Artery Pressure via Imaging and Tabular Data Mixing
Michal K. Grzeszczyk
Przemysław Korzeniowski
S. Alabed
Andrew J Swift
Tomasz Trzciński
Arkadiusz Sitek
37
0
0
11 Sep 2024
Action-Based ADHD Diagnosis in Video
Yichun Li
Yuxing Yang
Syed Mohsen Naqvi
18
3
0
03 Sep 2024
A Novel Audio-Visual Information Fusion System for Mental Disorders Detection
Yichun Li
Shuanglin Li
S. M. Naqvi
21
0
0
03 Sep 2024
FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition
Ishan Rajendrakumar Dave
Mamshad Nayeem Rizve
Mubarak Shah
AI4TS
30
2
0
02 Sep 2024
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets
Ishan Rajendrakumar Dave
Fabian Caba Heilbron
Mubarak Shah
Simon Jenni
46
1
0
02 Sep 2024
StimuVAR: Spatiotemporal Stimuli-aware Video Affective Reasoning with Multimodal Large Language Models
Y. Guo
Faizan Siddiqui
Yang Zhao
Rama Chellappa
Shao-Yuan Lo
LRM
49
2
0
31 Aug 2024
RoboMNIST: A Multimodal Dataset for Multi-Robot Activity Recognition Using WiFi Sensing, Video, and Audio
Kian Behzad
Rojin Zandi
Elaheh Motamedi
Hojjat Salehinejad
Milad Siami
34
0
0
29 Aug 2024
Spatio-Temporal Context Prompting for Zero-Shot Action Detection
Wei-Jhe Huang
Min-Hung Chen
Shang-Hong Lai
40
0
0
28 Aug 2024
Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms
Xiao Wang
Shiao Wang
Pengpeng Shao
Bo Jiang
Lin Zhu
Yonghong Tian
166
2
0
19 Aug 2024
Flatten: Video Action Recognition is an Image Classification task
Junlin Chen
Chengcheng Xu
Yangfan Xu
Jian Yang
Jun Yu Li
Zhiping Shi
39
1
0
17 Aug 2024
Dynamic and Compressive Adaptation of Transformers From Images to Videos
Guozhen Zhang
Jingyu Liu
Shengming Cao
Xiaotong Zhao
Kevin Zhao
Kai Ma
Limin Wang
ViT
29
1
0
13 Aug 2024
ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack
Ziyi Gao
Kai-xiang Chen
Zhipeng Wei
Tingshu Mou
Jingjing Chen
Zhiyu Tan
Hao Li
Yu-Gang Jiang
VGen
AAML
38
2
0
10 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
31
0
0
10 Aug 2024
Enhancing Human Action Recognition and Violence Detection Through Deep Learning Audiovisual Fusion
Pooya Janani
Amirabolfazl Suratgar
Afshin Taghvaeipour
21
2
0
04 Aug 2024
How Effective are Self-Supervised Models for Contact Identification in Videos
Omri Herscovici
Limalka Sadith
Liel David
Daniel Harari
Muhammad Haris Khan
30
0
0
01 Aug 2024
MPT-PAR:Mix-Parameters Transformer for Panoramic Activity Recognition
Wenqing Gan
Yaoyu Li
Jian Li
Zhangang Lin
ViT
32
0
0
01 Aug 2024
Dynamic Gesture Recognition in Ultra-Range Distance for Effective Human-Robot Interaction
Eran Bamani Beeri
Eden Nissinman
A. Sintov
24
0
0
31 Jul 2024
CardioSyntax: end-to-end SYNTAX score prediction -- dataset, benchmark and method
Alexander Ponomarchuk
Ivan Kruzhilov
Galina Zubkova
Artem Shadrin
Ruslan Utegenov
Ivan Bessonov
Pavel Blinov
40
0
0
29 Jul 2024
Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment
Shenghong Dai
Shiqi Jiang
Yifan Yang
Ting Cao
Mo Li
Suman Banerjee
Lili Qiu
49
2
0
25 Jul 2024
Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
Habib Hajimolahoseini
Walid Ahmed
Austin Wen
Yang Liu
29
0
0
23 Jul 2024
Motion Capture from Inertial and Vision Sensors
Xiaodong Chen
Wu Liu
Qian Bao
Xinchen Liu
Quanwei Yang
Ruoli Dai
Tao Mei
61
3
0
23 Jul 2024
SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection
D. Kollias
Anastasios Arsenos
J. Wingate
Stefanos D. Kollias
42
2
0
22 Jul 2024
StreamTinyNet: video streaming analysis with spatial-temporal TinyML
Hazem Hesham Yousef Shalby
Massimo Pavan
Manuel Roveri
48
0
0
22 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024
Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification
Lisa Anita De Santi
Jorg Schlotterer
Meike Nauta
Vincenzo Positano
Christin Seifert
31
2
0
19 Jul 2024
Self-Supervised Video Representation Learning in a Heuristic Decoupled Perspective
Zeen Song
Wenwen Qiang
Jianqi Zhang
Changwen Zheng
Wenwen Qiang
SSL
66
0
0
19 Jul 2024
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
42
0
0
18 Jul 2024