The Kinetics Human Action Video Dataset


19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
arXiv:1705.06950

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,152 papers shown
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance
Ruyang Liu
Haoran Tang
Haibo Liu
Yixiao Ge
Mingyu Ding
Chen Li
Jiankun Yang
VLM
243
17
0
04 Nov 2024
SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities
Ehsan Faghihi
Mohammedreza Zarenejad
Ali-Asghar Beheshti Shirazi
271
2
0
04 Nov 2024
Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge
Neural Information Processing Systems (NeurIPS), 2024
Weihua Du
Qiushi Lyu
Jiaming Shan
Zhenting Qi
Hongxin Zhang
...
Andi Peng
Tianmin Shu
Kwonjoon Lee
Behzad Dariush
Chuang Gan
471
9
0
04 Nov 2024
ROAD-Waymo: Action Awareness at Scale for Autonomous Driving
Salman Khan
Izzeddin Teeti
Reza Javanmard Alitappeh
Mihaela C. Stoian
Eleonora Giunchiglia
Gurkirt Singh
Andrew Bradley
Fabio Cuzzolin
292
2
0
03 Nov 2024
STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models
Zerui Wang
Yan Liu
312
6
0
01 Nov 2024
Learning Video Representations without Natural Videos
Xueyang Yu
Xinlei Chen
Yossi Gandelsman
VGen
AI4TS
247
2
0
31 Oct 2024
Video Token Merging for Long-form Video Understanding
Seon-Ho Lee
Jue Wang
Zhikang Zhang
D. Fan
Xinyu Li
290
15
0
31 Oct 2024
Deep Convolutional Neural Networks on Multiclass Classification of Three-Dimensional Brain Images for Parkinson's Disease Stage Prediction
Guan-Hua Huang
Wan-Chen Lai
Tai-Been Chen
Chien-Chin Hsu
Huei-Yung Chen
Yi-Chen Wu
Li-Ren Yeh
MedIm
183
3
0
31 Oct 2024
EchoFM: Foundation Model for Generalizable Echocardiogram Analysis
IEEE Transactions on Medical Imaging (IEEE TMI), 2024
Sekeun Kim
Pengfei Jin
Qing Xiao
Cheng Chen
Yiwei Li
Hui Ren
Xiang Li
Tianming Liu
Quanzheng Li
265
5
0
30 Oct 2024
AtGCN: A Graph Convolutional Network For Ataxic Gait Detection
Karan Bania
Tanmay Verlekar
117
2
0
30 Oct 2024
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Adrian Iordache
B. Alexe
Radu Tudor Ionescu
304
2
0
29 Oct 2024
Analytic Continual Test-Time Adaptation for Multi-Modality Corruption
Yufei Zhang
Yicheng Xu
Jianguo Huang
Zhiping Lin
Xiaofeng Zou
Cen Chen
Huiping Zhuang
TTA
212
1
0
29 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
198
3
0
29 Oct 2024
LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
N. V. R. Chappa
Khoa Luu
177
2
0
28 Oct 2024
BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events
European Conference on Computer Vision (ECCV), 2024
Yijin Li
Yichen Shen
Zhaoyang Huang
Shuo Chen
Weikang Bian
...
Keqiang Sun
Hujun Bao
Zhaopeng Cui
Guofeng Zhang
Hongsheng Li
3DPC
402
7
0
27 Oct 2024
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
Neural Information Processing Systems (NeurIPS), 2024
Z. Gong
Guangyin Bao
Tao Gui
Zhongwei Wan
Duoqian Miao
...
Changwei Wang
Rongtao Xu
Liang Hu
Ke Liu
Yu Zhang
DiffM
VGen
256
25
0
25 Oct 2024
Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance
M. Asres
Lei Jiao
C. Omlin
208
0
0
24 Oct 2024
AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
Xiaoxuan Ma
Yutang Lin
Yuan Xu
Stephan P. Kaufhold
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
449
4
0
22 Oct 2024
Masked Differential Privacy
David Schneider
Sina Sajadmanesh
Vikash Sehwag
Saquib Sarfraz
Rainer Stiefelhagen
Lingjuan Lyu
Vivek Sharma
227
0
0
22 Oct 2024
Detecting Adversarial Examples
Furkan Mumcu
Yasin Yilmaz
AAML
259
4
0
22 Oct 2024
Storyboard guided Alignment for Fine-grained Video Action Recognition
Enqi Liu
Liyuan Pan
Yan Yang
Yiran Zhong
Zhijing Wu
Xinxiao Wu
Liu Liu
185
0
0
18 Oct 2024
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
295
7
0
17 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
252
13
0
17 Oct 2024
On-the-fly Modulation for Balanced Multimodal Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yake Wei
D. Hu
Henghui Du
Ji-Rong Wen
233
28
0
15 Oct 2024
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Neural Information Processing Systems (NeurIPS), 2024
Minghao Zhu
Zhengpu Wang
Mengxian Hu
Ronghao Dang
Xiao Lin
Xun Zhou
Chengju Liu
Qijun Chen
250
3
0
14 Oct 2024
Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions
Kyungmin Kim
JB Lanier
Pierre Baldi
Charless C. Fowlkes
Roy Fox
234
3
0
13 Oct 2024
Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition
International Conference on Data Science and Advanced Analytics (DSAA), 2024
Cheng Liu
Xuyang Yan
Zekun Zhang
Cheng Ding
Tianhao Zhao
Shaya Jannati
Cynthia Martinez
Dietrich Stout
130
1
0
10 Oct 2024
Evaluating Model Performance with Hard-Swish Activation Function Adjustments
Sai Abhinav Pydimarry
Shekhar Madhav Khairnar
Sofia Garces Palacios
Ganesh Sankaranarayanan
Darian Hoagland
Dmitry Nepomnayshy
Huu Phong Nguyen
85
2
0
09 Oct 2024
Secure Video Quality Assessment Resisting Adversarial Attacks
IEEE Transactions on Broadcasting (IEEE Trans. Broadcast.), 2024
Ao Zhang
Yu Ran
Weixuan Tang
Yuan-Gen Wang
Qingxiao Guan
Chunsheng Yang
AAML
251
1
0
09 Oct 2024
MTFL: Multi-Timescale Feature Learning for Weakly-Supervised Anomaly Detection in Surveillance Videos
International Conference on Machine Vision (ICMV), 2024
Yiling Zhang
Erkut Akdag
Egor Bondarev
Peter H. N. de With
AI4TS
ViT
195
5
0
08 Oct 2024
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation
Fanqing Meng
Jiaqi Liao
Xinyu Tan
Wenqi Shao
Quanfeng Lu
Kaipeng Zhang
Yu Cheng
Dianqi Li
Yu Qiao
Ping Luo
VGen
EGVM
256
66
0
07 Oct 2024
Bisimulation metric for Model Predictive Control
International Conference on Learning Representations (ICLR), 2024
Yutaka Shimizu
Masayoshi Tomizuka
211
2
0
06 Oct 2024
Linear Transformer Topological Masking with Graph Random Features
International Conference on Learning Representations (ICLR), 2024
Isaac Reid
Kumar Avinava Dubey
Deepali Jain
Will Whitney
Amr Ahmed
...
Connor Schenck
Richard E. Turner
René Wagner
Adrian Weller
Krzysztof Choromanski
283
4
0
04 Oct 2024
Computer-aided Colorization State-of-the-science: A Survey
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
Yu Cao
Xin Duan
Xiangqiao Meng
P. Y. Mok
Ping Li
Tong-Yee Lee
226
0
0
03 Oct 2024
An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos
Arun V. Reddy
Ketul Shah
Corban Rivera
William Paul
Celso M. De Melo
Rama Chellappa
SLR
147
1
0
03 Oct 2024
LLaVA-Video: Video Instruction Tuning With Synthetic Data
Yuanhan Zhang
Jinming Wu
W. Li
Bo Li
Zejun Ma
Ziwei Liu
Chunyuan Li
SyDa
VGen
484
248
0
03 Oct 2024
COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation
Mingzhen Sun
Weining Wang
Xinxin Zhu
Jing Liu
VGen
DiffM
174
0
0
02 Oct 2024
Tracking objects that change in appearance with phase synchrony
International Conference on Learning Representations (ICLR), 2024
Sabine Muzellec
Drew Linsley
A. Ashok
E. Mingolla
Girik Malik
Rufin VanRullen
Thomas Serre
335
3
0
02 Oct 2024
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
International Conference on Learning Representations (ICLR), 2024
Dexuan Ding
Lei Wang
Liyun Zhu
Tom Gedeon
Piotr Koniusz
430
16
0
02 Oct 2024
Delving Deep into Engagement Prediction of Short Videos
European Conference on Computer Vision (ECCV), 2024
Dasong Li
Wenjie Li
Baili Lu
Hongsheng Li
Sizhuo Ma
Gurunandan Krishnan
Jian Wang
375
5
0
30 Sep 2024
REST-HANDS: Rehabilitation with Egocentric Vision Using Smartglasses for Treatment of Hands after Surviving Stroke
Wiktor Mucha
Kentaro Tanaka
M. Kampel
236
0
0
30 Sep 2024
CycleCrash: A Dataset of Bicycle Collision Videos for Collision Prediction and Analysis
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Nishq Poorav Desai
Ali Etemad
Michael A. Greenspan
303
4
0
30 Sep 2024
Fast Encoding and Decoding for Implicit Video Representation
European Conference on Computer Vision (ECCV), 2024
Hao Chen
Saining Xie
Ser-Nam Lim
Abhinav Shrivastava
269
6
0
28 Sep 2024
Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks
Min Yang
Zichen Zhang
Limin Wang
AI4TS
215
0
0
27 Sep 2024
SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining
Ruiqi Xian
Xiyang Wu
Tianrui Guan
Xijun Wang
Boqing Gong
Dinesh Manocha
ViT
265
0
0
26 Sep 2024
Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming
Zehao Zhu
Wei Sun
Jun Jia
Wei Wu
Sibin Deng
Kai Li
Ying-Cong Chen
Xiongkuo Min
Jia Wang
Guangtao Zhai
162
0
0
26 Sep 2024
EAGLE: Egocentric AGgregated Language-video Engine
ACM Multimedia (MM), 2024
Jing Bi
Yunlong Tang
Luchuan Song
Ali Vosoughi
Nguyen Nguyen
Chenliang Xu
222
16
0
26 Sep 2024
Towards Synthetic Data Generation for Improved Pain Recognition in Videos under Patient Constraints
Jonas Nasimzada
Jens Kleesiek
Ken Herrmann
Alina Roitberg
C. Seibold
133
1
0
24 Sep 2024
Self-Supervised Any-Point Tracking by Contrastive Random Walks
European Conference on Computer Vision (ECCV), 2024
Ayush Shrivastava
Andrew Owens
219
11
0
24 Sep 2024
Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
European Conference on Computer Vision (ECCV), 2024
Yuxiao Chen
Keqin Li
Wentao Bao
Deep Patel
Yu Kong
Martin Renqiang Min
Dimitris N. Metaxas
DiffM
291
5
0
22 Sep 2024