1711.11248
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Papers citing
"A Closer Look at Spatiotemporal Convolutions for Action Recognition"
50 / 487 papers shown
Title
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition
Runduo Han
Xiuping Liu
Shangxuan Yi
Yi Zhang
Hongchen Tan
6
0
0
17 May 2025
UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing
Yung-Hsuan Lai
Janek Ebbers
Yu-Chiang Frank Wang
François Germain
Michael Jeffrey Jones
Moitreya Chatterjee
26
0
0
14 May 2025
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
Ho-Joong Kim
Y. E. Lee
Jung-Ho Hong
Seong-Whan Lee
47
0
0
09 May 2025
Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges
Hao Xu
Arbind Agrahari Baniya
Sam Well
Mohamed Reda Bouadjenek
Richard Dazeley
S. Aryal
AI4TS
29
0
0
06 May 2025
A simple and effective approach for body part recognition on CT scans based on projection estimation
Franko Hrzic
Mohammadreza Movahhedi
Ophelie Lavoie-Gagne
Ata Kiapour
46
0
0
30 Apr 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
84
0
0
28 Apr 2025
Hierarchical and Multimodal Data for Daily Activity Understanding
Ghazal Kaviani
Yavuz Yarici
Seulgi Kim
Mohit Prabhushankar
Ghassan AlRegib
Mashhour Solh
Ameya Patil
57
0
0
24 Apr 2025
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
Boyang Deng
Songyou Peng
Kyle Genova
Gordon Wetzstein
Noah Snavely
Leonidas J. Guibas
Thomas Funkhouser
HAI
202
0
0
11 Apr 2025
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Shreyank N. Gowda
Boyan Gao
Xiao Gu
Xiaobo Jin
VLM
41
0
0
02 Apr 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
247
0
0
26 Mar 2025
Adapting to the Unknown: Training-Free Audio-Visual Event Perception with Dynamic Thresholds
E. Shaar
Ariel Shaulov
Gal Chechik
Lior Wolf
VLM
41
0
0
17 Mar 2025
A Large-Scale Study on Video Action Dataset Condensation
Yang Chen
Sheng Guo
Bo Zheng
Limin Wang
DD
81
2
0
13 Mar 2025
ICPR 2024 Competition on Rider Intention Prediction
Shankar Gangisetty
Abdul Wasi
Shyam Nandan Rai
C. V. Jawahar
Sajay Raj
...
Ayesha Choudhary
Aaryadev Chandra
Dev Chandan
Shireen Chand
Suvaditya Mukherjee
50
0
0
11 Mar 2025
Robust Dynamic Facial Expression Recognition
Feng Liu
Hanyang Wang
Siyuan Shen
50
1
0
22 Feb 2025
Natural Language Generation from Visual Sequences: Challenges and Future Directions
Aditya K Surikuchi
Raquel Fernández
Sandro Pezzelle
EGVM
260
0
0
18 Feb 2025
Benchmarking Zero-Shot Facial Emotion Annotation with Large Language Models: A Multi-Class and Multi-Frame Approach in DailyLife
He Zhang
Xinyi Fu
CVBM
50
2
0
18 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
69
0
0
06 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification
Yuexi Du
Jiazhen Zhang
Tal Zeevi
Nicha Dvornek
J. Onofrey
61
1
0
17 Jan 2025
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi
Skanda Koppula
Shreya Pathak
Justin T Chiu
Joseph Heyward
Viorica Patraucean
Jiajun Shen
Antoine Miech
Andrew Zisserman
Aida Nematzadeh
VLM
69
24
0
31 Dec 2024
Interact with me: Joint Egocentric Forecasting of Intent to Interact, Attitude and Social Actions
Tongfei Bian
Yiming Ma
Mathieu Chollet
Victor Sanchez
T. Guha
EgoV
99
1
0
21 Dec 2024
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Xinrui Zhou
Yuhao Huang
Haoran Dou
Shijing Chen
Ao Chang
...
Jie Jessie Ren
Ruobing Huang
Jun Cheng
Wufeng Xue
Dong Ni
MedIm
189
0
0
25 Sep 2024
StimuVAR: Spatiotemporal Stimuli-aware Video Affective Reasoning with Multimodal Large Language Models
Y. Guo
Faizan Siddiqui
Yang Zhao
Rama Chellappa
Shao-Yuan Lo
LRM
49
2
0
31 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
31
0
0
10 Aug 2024
CardioSyntax: end-to-end SYNTAX score prediction -- dataset, benchmark and method
Alexander Ponomarchuk
Ivan Kruzhilov
Galina Zubkova
Artem Shadrin
Ruslan Utegenov
Ivan Bessonov
Pavel Blinov
40
0
0
29 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024
Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification
Lisa Anita De Santi
Jorg Schlotterer
Meike Nauta
Vincenzo Positano
Christin Seifert
31
2
0
19 Jul 2024
Self-Supervised Video Representation Learning in a Heuristic Decoupled Perspective
Zeen Song
Jianqi Zhang
Changwen Zheng
Wenwen Qiang
SSL
66
0
0
19 Jul 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
Haodong Chen
Haojian Huang
Junhao Dong
Mingzhe Zheng
Dian Shao
45
16
0
02 Jul 2024
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition
Yang Wang
Haiyang Mei
Qirui Bao
Ziqi Wei
Mike Zheng Shou
Haizhou Li
Bo Dong
Xin Yang
46
1
0
20 Jun 2024
Video-based Exercise Classification and Activated Muscle Group Prediction with Hybrid X3D-SlowFast Network
Manvik Pasula
Pramit Saha
29
0
0
10 Jun 2024
Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network
Xinquan Yang
Xuguang Li
Xiaoling Luo
Leilei Zeng
Yudi Zhang
Linlin Shen
Yongqiang Deng
MedIm
48
2
0
07 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Qifeng Chen
Yu Guo
VGen
104
16
0
06 Jun 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning
Deng Li
Bohao Xing
Xin Liu
37
5
0
03 May 2024
Self-supervised learning for classifying paranasal anomalies in the maxillary sinus
Debayan Bhattacharya
F. Behrendt
B. Becker
Lennart Maack
Dirk Beyersdorff
...
B. Cheng
D. Eggert
C. Betz
A. Hoffmann
Alexander Schlaefer
SSL
31
0
0
29 Apr 2024
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera
Yan Ru Pei
Sasskia Brüers
Sébastien Crouzet
Douglas McLelland
Olivier Coenen
31
8
0
13 Apr 2024
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
36
0
05 Apr 2024
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
Chih-Chung Hsu
Chia-Ming Lee
Chiang Fan Yang
Yi-Shiuan Chou
Chih-Yu Jiang
Shen-Chieh Tai
Chin-Han Tsai
44
0
0
02 Apr 2024
TCNet: Continuous Sign Language Recognition from Trajectories and Correlated Regions
Hui Lu
A. A. Salah
Ronald Poppe
SLR
32
5
0
18 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
40
5
0
14 Mar 2024
Deepfake Detection and the Impact of Limited Computing Capabilities
Paloma Cantero-Arjona
Alfonso Sánchez-Macián
33
2
0
08 Feb 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
67
1
0
15 Jan 2024
SonicVisionLM: Playing Sound with Vision Language Models
Zhifeng Xie
Shengye Yu
Qile He
Mengtian Li
VLM
VGen
28
2
0
09 Jan 2024
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
ConFormer: A Novel Collection of Deep Learning Models to Assist Cardiologists in the Assessment of Cardiac Function
Ethan Thomas
Salman Aslam
MedIm
34
0
0
13 Dec 2023
Low-power, Continuous Remote Behavioral Localization with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
32
7
0
06 Dec 2023