The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,152 papers shown
Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model
Bita Baroutian
Atefe Aghaei
M. Moghaddam
CVBM
217
0
0
04 Dec 2025
ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography
Yeganeh Ghamary
Victoria Wu
H. Vaseli
C. Luong
T. Tsang
Siavash Bigdeli
Purang Abolmaesumi
77
0
0
03 Dec 2025
Unique Lives, Shared World: Learning from Single-Life Videos
Tengda Han
Sayna Ebrahimi
Dilara Gokay
Li Yang Ku
M. Ovsjanikov
...
Daniel Zoran
Viorica Patraucean
João Carreira
Andrew Zisserman
Dima Damen
161
0
0
03 Dec 2025
Heatmap Pooling Network for Action Recognition from RGB Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Mengyuan Liu
Jinfu Liu
Yongkang Jiang
Bin He
92
0
0
03 Dec 2025
OmniFD: A Unified Model for Versatile Face Forgery Detection
Haotian Liu
Haoyu Chen
Chenhui Pan
You Hu
Guoying Zhao
Xiaobai Li
CVBM
291
0
0
30 Nov 2025
Structured Context Learning for Generic Event Boundary Detection
Xin Gu
Congcong Li
Xinyao Wang
Dexiang Hong
Libo Zhang
Tiejian Luo
Longyin Wen
Heng Fan
72
0
0
29 Nov 2025
DisMo: Disentangled Motion Representations for Open-World Motion Transfer
Thomas Ressler-Antal
Frank Fundel
Malek Ben Alaya
S. A. Baumann
Felix Krause
Ming Gui
Bjorn Ommer
DiffM, VGen
105
0
0
28 Nov 2025
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
H. Rasheed
Mohammed Zumri
Muhammad Maaz
Ming-Hsuan Yang
Fahad Shahbaz Khan
Salman Khan
AI4TS, LRM
166
0
0
28 Nov 2025
GA2-CLIP: Generic Attribute Anchor for Efficient Prompt Tuning in Video-Language Models
Bin Wang
Ruotong Hu
Wenqian Wang
W. Li
Mingliang Gao
Runmin Cong
Wei Zhang
VLM
128
0
0
27 Nov 2025
SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition
Hongda Liu
Yunfan Liu
Changlu Wang
Yunlong Wang
Zhenan Sun
LLMAG
228
0
0
27 Nov 2025
LungEvaty: A Scalable, Open-Source Transformer-based Deep Learning Model for Lung Cancer Risk Prediction in LDCT Screening
Johannes Brandt
Maulik Chevli
R. Braren
Georgios Kaissis
Philip Muller
Daniel Rueckert
LM&MA, MedIm
335
0
0
25 Nov 2025
Smooth regularization for efficient video recognition
Gil Goldman
Raja Giryes
Mahadev Satyanarayanan
AI4TS
220
0
0
25 Nov 2025
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
Shaobo Wang
Tianle Niu
Runkang Yang
Deshan Liu
Xu He
Zichen Wen
Conghui He
Xuming Hu
Linfeng Zhang
VGen
194
0
0
24 Nov 2025
Modality-Collaborative Low-Rank Decomposers for Few-Shot Video Domain Adaptation
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
159
0
0
24 Nov 2025
ViMix-14M: A Curated Multi-Source Video-Text Dataset with Long-Form, High-Quality Captions and Crawl-Free Access
Timing Yang
Sucheng Ren
Alan Yuille
Feng Wang
VGen
123
0
0
23 Nov 2025
Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization
Sina Mokhtarzadeh Azar
Emad Bahrami
Enrico Pallotta
Gianpiero Francesca
Radu Timofte
Juergen Gall
DiffM
121
0
0
23 Nov 2025
BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization
Rahul Kumar
Vipul Baghel
Sudhanshu Singh
Bikash Kumar Badatya
Shivam Yadav
Babji Srinivasan
Ravi S. Hegde
137
0
0
20 Nov 2025
MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization
Zhenying Fang
Richang Hong
151
0
0
17 Nov 2025
RoCoISLR: A Romanian Corpus for Isolated Sign Language Recognition
Cătălin-Alexandru Rîpanu
Andrei-Theodor Hotnog
Giulia-Stefania Imbrea
Dumitru-Clementin Cercel
SLR
304
0
0
16 Nov 2025
Cross-View Cross-Modal Unsupervised Domain Adaptation for Driver Monitoring System
Aditi Bhalla
Christian Hellert
Enkelejda Kasneci
76
0
0
15 Nov 2025
RodEpil: A Video Dataset of Laboratory Rodents for Seizure Detection and Benchmark Evaluation
Daniele Perlo
Vladimir Despotovic
Selma Boudissa
Sang-Yoon Kim
P. V. Nazarov
Yanrong Zhang
Max Wintermark
O. Keunen
99
0
0
13 Nov 2025
EgoEMS: A High-Fidelity Multimodal Egocentric Dataset for Cognitive Assistance in Emergency Medical Services
Keshara Weerasinghe
Xueren Ge
Tessa Heick
L. Wijayasingha
Anthony Cortez
Abhishek Satpathy
John A. Stankovic
H. Alemzadeh
200
0
0
13 Nov 2025
PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild
Felix B. Mueller
Jan F. Meier
Timo Lueddecke
Richard Vogg
Roger L. Freixanet
...
Liran Samuni
Oliver Schülke
Neda Shahidi
Erin G. Wessling
Alexander S. Ecker
178
0
0
12 Nov 2025
RadHARSimulator V2: Video to Doppler Generator
Weicheng Gao
84
0
0
12 Nov 2025
FlowFeat: Pixel-Dense Embedding of Motion Profiles
Nikita Araslanov
Anna Sonnweber
Daniel Cremers
MDE
360
1
0
10 Nov 2025
Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV
Wenbo Huang
Jinghui Zhang
Zhenghao Chen
Guang Li
Lei Zhang
Yang Cao
Fang Dong
Takahiro Ogawa
Miki Haseyama
225
0
0
10 Nov 2025
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
Xinyi Wang
Angeliki V. Katsenou
Junxiao Shen
David Bull
93
0
0
10 Nov 2025
Mitigating Modality Imbalance in Multi-modal Learning via Multi-objective Optimization
Heshan Devaka Fernando
Parikshit Ram
Yi Zhou
Soham Dan
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
229
0
0
10 Nov 2025
Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
199
0
0
06 Nov 2025
DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding
Zixuan Liu
Siavash H. Khajavi
Guangkai Jiang
VLM
183
0
0
04 Nov 2025
A Cognitive Process-Inspired Architecture for Subject-Agnostic Brain Visual Decoding
Jingyu Lu
Haonan Wang
Qixiang Zhang
Xiaomeng Li
82
0
0
04 Nov 2025
Dynamic Reflections: Probing Video Representations with Text Alignment
Tyler Zhu
Tengda Han
Leonidas Guibas
Viorica Patraucean
M. Ovsjanikov
VGen
253
0
0
04 Nov 2025
Web-Scale Collection of Video Data for 4D Animal Reconstruction
Brian Nlong Zhao
Jiajun Wu
Shangzhe Wu
125
1
0
03 Nov 2025
FastBoost: Progressive Attention with Dynamic Scaling for Efficient Deep Learning
JunXi Yuan
121
0
0
02 Nov 2025
Enhancing Spatio-Temporal Zero-shot Action Recognition with Language-driven Description Attributes
Pattern Recognition (Pattern Recogn.), 2025
Yehna Kim
Y. Kim
Seong-Whan Lee
VLM
128
0
0
31 Oct 2025
GMFVAD: Using Grained Multi-modal Feature to Improve Video Anomaly Detection
Guangyu Dai
Dong Chen
Siliang Tang
Yueting Zhuang
108
0
0
23 Oct 2025
Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
Ilona Demler
Saumya Chauhan
Georgia Gkioxari
111
1
0
22 Oct 2025
FeatureFool: Zero-Query Fooling of Video Models via Feature Map
Duoxun Tang
Xi Xiao
Guangwu Hu
Kangkang Sun
Xiao Yang
Dongyang Chen
Qing Li
Yongjie Yin
Jiyao Wang
AAML
230
1
0
21 Oct 2025
Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning
Zhaocheng Liu
Zhiwen Yu
Xiaoqing Liu
204
0
0
20 Oct 2025
A Comprehensive Survey on World Models for Embodied AI
Xinqing Li
Xin He
Le Zhang
Yun-Hai Liu
Xiaoli Li
Yun-Hai Liu
VGen, LM&Ro, SyDa
252
5
0
19 Oct 2025
StretchySnake: Flexible SSM Training Unlocks Action Recognition Across Spatio-Temporal Scales
Nyle Siddiqui
Rohit Gupta
S. Swetha
Mubarak Shah
153
0
0
17 Oct 2025
Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs
Minji Kim
Taekyung Kim
Bohyung Han
98
0
0
15 Oct 2025
State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
Jiahuan Zhou
Kai Zhu
Zhenyu Cui
Zichen Liu
Xu Zou
Gang Hua
95
1
0
14 Oct 2025
Mixup Helps Understanding Multimodal Video Better
Xiaoyu Ma
Ding Ding
Hao Chen
124
0
0
13 Oct 2025
Image-to-Video Transfer Learning based on Image-Language Foundation Models: A Comprehensive Survey
Jinxuan Li
Chaolei Tan
Haoxuan Chen
Jianxin Ma
Jian-Fang Hu
Wei-Shi Zheng
Jianhuang Lai
VLM
151
1
0
12 Oct 2025
ESCA: Contextualizing Embodied Agents via Scene-Graph Generation
Jiani Huang
Amish Sethi
Matthew Kuo
Mayank Keoliya
Neelay Velingker
JungHo Jung
Ser-Nam Lim
Ziyang Li
Mayur Naik
LM&Ro, VLM
282
0
0
11 Oct 2025
SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher
Ali O. Polat
Ehsan Mohammady Ardehaly
Mehrdad Salehi
Zia Ghiasi
Prasanth Murali
Chen Chen
186
2
0
09 Oct 2025
Distributed Algorithms for Multi-Agent Multi-Armed Bandits with Collision
Daoyuan Zhou
Xuchuang Wang
L. Yang
Yang Gao
159
1
0
08 Oct 2025
Aligning Perception, Reasoning, Modeling and Interaction: A Survey on Physical AI
Kun Xiang
Terry Jingchen Zhang
Yinya Huang
Jixi He
Zirong Liu
...
J. N. Han
Hang Xu
Han Li
Bin Dong
Xiaodan Liang
PINN, AI4CE
376
1
0
06 Oct 2025
Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers
Xianhang Li
Chen Huang
Chun-Liang Li
Eran Malach
J. Susskind
Vimal Thilak
Etai Littwin
160
1
0
29 Sep 2025