1705.06950
Cited By
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing
"The Kinetics Human Action Video Dataset"
50 / 2,148 papers shown
Title
OmniFD: A Unified Model for Versatile Face Forgery Detection
Haotian Liu
Haoyu Chen
Chenhui Pan
You Hu
Guoying Zhao
Xiaobai Li
CVBM
176
0
0
30 Nov 2025
Structured Context Learning for Generic Event Boundary Detection
Xin Gu
Congcong Li
Xinyao Wang
Dexiang Hong
Libo Zhang
Tiejian Luo
Longyin Wen
Heng Fan
60
0
0
29 Nov 2025
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
H. Rasheed
Mohammed Zumri
Muhammad Maaz
Ming-Hsuan Yang
Fahad Shahbaz Khan
Salman Khan
AI4TS
LRM
113
0
0
28 Nov 2025
DisMo: Disentangled Motion Representations for Open-World Motion Transfer
Thomas Ressler-Antal
Frank Fundel
Malek Ben Alaya
S. A. Baumann
Felix Krause
Ming Gui
Bjorn Ommer
DiffM
VGen
41
0
0
28 Nov 2025
GA2-CLIP: Generic Attribute Anchor for Efficient Prompt Tuning in Video-Language Models
Bin Wang
Ruotong Hu
Wenqian Wang
W. Li
Mingliang Gao
Runmin Cong
Wei Zhang
VLM
88
0
0
27 Nov 2025
SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition
Hongda Liu
Yunfan Liu
Changlu Wang
Yunlong Wang
Zhenan Sun
LLMAG
136
0
0
27 Nov 2025
Smooth regularization for efficient video recognition
Gil Goldman
Raja Giryes
Mahadev Satyanarayanan
AI4TS
167
0
0
25 Nov 2025
LungEvaty: A Scalable, Open-Source Transformer-based Deep Learning Model for Lung Cancer Risk Prediction in LDCT Screening
Johannes Brandt
Maulik Chevli
R. Braren
Georgios Kaissis
Philip Muller
Daniel Rueckert
LM&MA
MedIm
319
0
0
25 Nov 2025
Modality-Collaborative Low-Rank Decomposers for Few-Shot Video Domain Adaptation
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
150
0
0
24 Nov 2025
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
Shaobo Wang
Tianle Niu
Runkang Yang
Deshan Liu
Xu He
Zichen Wen
Conghui He
Xuming Hu
Linfeng Zhang
VGen
170
0
0
24 Nov 2025
ViMix-14M: A Curated Multi-Source Video-Text Dataset with Long-Form, High-Quality Captions and Crawl-Free Access
Timing Yang
Sucheng Ren
Alan Yuille
Feng Wang
VGen
114
0
0
23 Nov 2025
Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization
Sina Mokhtarzadeh Azar
Emad Bahrami
Enrico Pallotta
Gianpiero Francesca
Radu Timofte
Juergen Gall
DiffM
100
0
0
23 Nov 2025
BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization
Rahul Kumar
Vipul Baghel
Sudhanshu Singh
Bikash Kumar Badatya
Shivam Yadav
Babji Srinivasan
Ravi S. Hegde
112
0
0
20 Nov 2025
MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization
Zhenying Fang
Richang Hong
128
0
0
17 Nov 2025
RoCoISLR: A Romanian Corpus for Isolated Sign Language Recognition
Cătălin-Alexandru Rîpanu
Andrei-Theodor Hotnog
Giulia-Stefania Imbrea
Dumitru-Clementin Cercel
SLR
265
0
0
16 Nov 2025
Cross-View Cross-Modal Unsupervised Domain Adaptation for Driver Monitoring System
Aditi Bhalla
Christian Hellert
Enkelejda Kasneci
37
0
0
15 Nov 2025
RodEpil: A Video Dataset of Laboratory Rodents for Seizure Detection and Benchmark Evaluation
Daniele Perlo
Vladimir Despotovic
Selma Boudissa
Sang-Yoon Kim
P. V. Nazarov
Yanrong Zhang
Max Wintermark
O. Keunen
64
0
0
13 Nov 2025
EgoEMS: A High-Fidelity Multimodal Egocentric Dataset for Cognitive Assistance in Emergency Medical Services
Keshara Weerasinghe
Xueren Ge
Tessa Heick
L. Wijayasingha
Anthony Cortez
Abhishek Satpathy
John A. Stankovic
H. Alemzadeh
178
0
0
13 Nov 2025
PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild
Felix B. Mueller
Jan F. Meier
Timo Lueddecke
Richard Vogg
Roger L. Freixanet
...
Liran Samuni
Oliver Schülke
Neda Shahidi
Erin G. Wessling
Alexander S. Ecker
153
0
0
12 Nov 2025
RadHARSimulator V2: Video to Doppler Generator
Weicheng Gao
48
0
0
12 Nov 2025
FlowFeat: Pixel-Dense Embedding of Motion Profiles
Nikita Araslanov
Anna Sonnweber
Daniel Cremers
MDE
322
1
0
10 Nov 2025
Mitigating Modality Imbalance in Multi-modal Learning via Multi-objective Optimization
Heshan Devaka Fernando
Parikshit Ram
Yi Zhou
Soham Dan
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
169
0
0
10 Nov 2025
Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV
Wenbo Huang
Jinghui Zhang
Zhenghao Chen
Guang Li
Lei Zhang
Yang Cao
Fang Dong
Takahiro Ogawa
Miki Haseyama
190
0
0
10 Nov 2025
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
Xinyi Wang
Angeliki V. Katsenou
Junxiao Shen
David Bull
76
0
0
10 Nov 2025
Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
148
0
0
06 Nov 2025
DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding
Zixuan Liu
Siavash H. Khajavi
Guangkai Jiang
VLM
144
0
0
04 Nov 2025
A Cognitive Process-Inspired Architecture for Subject-Agnostic Brain Visual Decoding
Jingyu Lu
Haonan Wang
Qixiang Zhang
Xiaomeng Li
52
0
0
04 Nov 2025
Dynamic Reflections: Probing Video Representations with Text Alignment
Tyler Zhu
Tengda Han
Leonidas Guibas
Viorica Patraucean
M. Ovsjanikov
VGen
233
0
0
04 Nov 2025
Web-Scale Collection of Video Data for 4D Animal Reconstruction
Brian Nlong Zhao
Jiajun Wu
Shangzhe Wu
112
1
0
03 Nov 2025
FastBoost: Progressive Attention with Dynamic Scaling for Efficient Deep Learning
JunXi Yuan
84
0
0
02 Nov 2025
Enhancing Spatio-Temporal Zero-shot Action Recognition with Language-driven Description Attributes
Pattern Recognition (Pattern Recogn.), 2025
Yehna Kim
Y. Kim
Seong-Whan Lee
VLM
99
0
0
31 Oct 2025
GMFVAD: Using Grained Multi-modal Feature to Improve Video Anomaly Detection
Guangyu Dai
Dong Chen
Siliang Tang
Yueting Zhuang
92
0
0
23 Oct 2025
Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
Ilona Demler
Saumya Chauhan
Georgia Gkioxari
90
1
0
22 Oct 2025
FeatureFool: Zero-Query Fooling of Video Models via Feature Map
Duoxun Tang
Xi Xiao
Guangwu Hu
Kangkang Sun
Xiao Yang
Dongyang Chen
Qing Li
Yongjie Yin
Jiyao Wang
AAML
194
1
0
21 Oct 2025
Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning
Zhaocheng Liu
Zhiwen Yu
Xiaoqing Liu
160
0
0
20 Oct 2025
A Comprehensive Survey on World Models for Embodied AI
Xinqing Li
Xin He
Le Zhang
Yun-Hai Liu
Xiaoli Li
Yun Liu
VGen
LM&Ro
SyDa
216
2
0
19 Oct 2025
StretchySnake: Flexible SSM Training Unlocks Action Recognition Across Spatio-Temporal Scales
Nyle Siddiqui
Rohit Gupta
S. Swetha
Mubarak Shah
140
0
0
17 Oct 2025
Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs
Minji Kim
Taekyung Kim
Bohyung Han
87
0
0
15 Oct 2025
State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
Jiahuan Zhou
Kai Zhu
Zhenyu Cui
Zichen Liu
Xu Zou
Gang Hua
76
1
0
14 Oct 2025
Mixup Helps Understanding Multimodal Video Better
Xiaoyu Ma
Ding Ding
Hao Chen
108
0
0
13 Oct 2025
Image-to-Video Transfer Learning based on Image-Language Foundation Models: A Comprehensive Survey
Jinxuan Li
Chaolei Tan
Haoxuan Chen
Jianxin Ma
Jian-Fang Hu
Wei-Shi Zheng
Jianhuang Lai
VLM
129
1
0
12 Oct 2025
ESCA: Contextualizing Embodied Agents via Scene-Graph Generation
Jiani Huang
Amish Sethi
Matthew Kuo
Mayank Keoliya
Neelay Velingker
JungHo Jung
Ser-Nam Lim
Ziyang Li
Mayur Naik
LM&Ro
VLM
251
0
0
11 Oct 2025
SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher
Ali O. Polat
Ehsan Mohammady Ardehaly
Mehrdad Salehi
Zia Ghiasi
Prasanth Murali
Chen Chen
166
1
0
09 Oct 2025
Distributed Algorithms for Multi-Agent Multi-Armed Bandits with Collision
Daoyuan Zhou
Xuchuang Wang
L. Yang
Yang Gao
85
1
0
08 Oct 2025
Aligning Perception, Reasoning, Modeling and Interaction: A Survey on Physical AI
Kun Xiang
Terry Jingchen Zhang
Yinya Huang
Jixi He
Zirong Liu
...
J. N. Han
Hang Xu
Han Li
Bin Dong
Xiaodan Liang
PINN
AI4CE
344
1
0
06 Oct 2025
Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers
Xianhang Li
Chen Huang
Chun-Liang Li
Eran Malach
J. Susskind
Vimal Thilak
Etai Littwin
134
1
0
29 Sep 2025
NeRV-Diffusion: Diffuse Implicit Neural Representations for Video Synthesis
Yixuan Ren
Hanyu Wang
Hao Chen
Bo He
Abhinav Shrivastava
DiffM
VGen
108
1
0
29 Sep 2025
Disentangling Static and Dynamic Information for Reducing Static Bias in Action Recognition
Masato Kobayashi
Ning Ding
Toru Tamaki
104
1
0
27 Sep 2025
Category Discovery: An Open-World Perspective
Zhenqi He
Yuanpei Liu
Kai Han
226
1
0
26 Sep 2025
VC-Agent: An Interactive Agent for Customized Video Dataset Collection
Yidan Zhang
Mutian Xu
Yiming Hao
Kun Zhou
Jiahao Chang
Xiaoqiang Liu
Pengfei Wan
Hongbo Fu
Xiaoguang Han
VGen
160
0
0
25 Sep 2025