The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,152 papers shown
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
269
18
0
07 Feb 2024
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
AAAI Conference on Artificial Intelligence (AAAI), 2024
Qinliang Lin
Cheng Luo
Zenghao Niu
Xilin He
Weicheng Xie
Yuanbo Hou
Linlin Shen
Siyang Song
AAML
268
25
0
06 Feb 2024
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
AAAI Conference on Artificial Intelligence (AAAI), 2024
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
316
10
0
05 Feb 2024
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
International Conference on Machine Learning (ICML), 2024
Yang Jin
Zhicheng Sun
Kun Xu
Kun Xu
Liwei Chen
...
Yuliang Liu
Chen Zhang
Yang Song
Kun Gai
Yadong Mu
VGen
250
77
0
05 Feb 2024
Taylor Videos for Action Recognition
International Conference on Machine Learning (ICML), 2024
Lei Wang
Xiuyuan Yuan
Tom Gedeon
Liang Zheng
541
13
0
05 Feb 2024
Time-, Memory- and Parameter-Efficient Visual Adaptation
Computer Vision and Pattern Recognition (CVPR), 2024
Otniel-Bogdan Mercea
Alexey Gritsenko
Cordelia Schmid
Anurag Arnab
VLM
191
22
0
05 Feb 2024
Classification of Tennis Actions Using Deep Learning
Emil Hovad
Therese Hougaard-Jensen
L. H. Clemmensen
70
6
0
04 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-Xiong Wang
Derek Hoiem
480
14
0
04 Feb 2024
NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties
Jingyuan Sun
Mingxiao Li
Zijiao Chen
Marie-Francine Moens
VGen
255
13
0
02 Feb 2024
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Pengyuan Zhou
Lin Wang
Zhi Liu
Yanbin Hao
Pan Hui
Sasu Tarkoma
J. Kangasharju
VGen
252
46
0
30 Jan 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
Florentin Wörgötter
Alexander S. Ecker
400
15
0
29 Jan 2024
MV2MAE: Multi-View Video Masked Autoencoders
Ketul Shah
Robert Crandall
Jie Xu
Peng Zhou
Marian George
Mayank Bansal
Rama Chellappa
247
6
0
29 Jan 2024
Multi-model learning by sequential reading of untrimmed videos for action recognition
Kodai Kamiya
Toru Tamaki
255
0
0
26 Jan 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
Computer Vision and Pattern Recognition (CVPR), 2024
Yiyuan Zhang
Xiaohan Ding
Kaixiong Gong
Yixiao Ge
Ying Shan
Xiangyu Yue
ViT
309
11
0
25 Jan 2024
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition
International Journal of Computer Vision (IJCV), 2024
Otto Brookes
Majid Mirmehdi
Colleen Stephens
Samuel Angedakin
Katherine Corogenes
...
Klaus Zuberbühler
Christophe Boesch
M. Arandjelovic
H. Kühl
T. Burghardt
219
30
0
24 Jan 2024
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
European Conference on Computer Vision (ECCV), 2024
Yongwei Nie
Hao Huang
Chengjiang Long
Qing Zhang
Pradipta Maji
Hongmin Cai
263
6
0
24 Jan 2024
Deep Learning for Computer Vision based Activity Recognition and Fall Detection of the Elderly: a Systematic Review
F. X. Gaya-Morey
Cristina Manresa-Yee
Jose Maria Buades Rubio
163
47
0
22 Jan 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
298
13
0
22 Jan 2024
M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2024
Mengmeng Wang
Jiazheng Xing
Boyuan Jiang
Jun Chen
Jianbiao Mei
Xingxing Zuo
Guang Dai
Jingdong Wang
Yong-Jin Liu
VLM
204
8
0
22 Jan 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Eric Wang
Xin Li
Luisa Verdoliva
Shu Hu
877
89
0
22 Jan 2024
Exploring Missing Modality in Multimodal Egocentric Datasets
Merey Ramazanova
Alejandro Pardo
Humam Alwassel
Guohao Li
EgoV
299
7
0
21 Jan 2024
Adversarial Augmentation Training Makes Action Recognition Models More Robust to Realistic Video Distribution Shifts
International Conferences on Pattern Recognition and Artificial Intelligence (ICCPRAI), 2024
Kiyoon Kim
Shreyank N. Gowda
Panagiotis Eustratiadis
Antreas Antoniou
Robert B Fisher
362
2
0
21 Jan 2024
Deep Reinforcement Learning Empowered Activity-Aware Dynamic Health Monitoring Systems
Ziqiang Ye
Yulan Gao
Yue Xiao
Zehui Xiong
Dusit Niyato
63
2
0
19 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
425
9
0
18 Jan 2024
Depth Over RGB: Automatic Evaluation of Open Surgery Skills Using Depth Camera
Ido Zuckerman
Nicole Werner
Jonathan Kouchly
Emma Huston
Shannon DiMarco
Paul D Dimusto
S. Laufer
167
3
0
18 Jan 2024
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jiu Feng
Mehmet Hamza Erol
Joon Son Chung
Arda Senocak
154
2
0
16 Jan 2024
Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding
Morteza Moradi
S. Palazzo
C. Spampinato
191
8
0
15 Jan 2024
FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos
S. DarshanSingh
Zeeshan Khan
Makarand Tapaswi
VLM
CLIP
198
6
0
15 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
375
2
0
15 Jan 2024
Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yukun Zuo
Hantao Yao
Liansheng Zhuang
Changsheng Xu
320
5
0
11 Jan 2024
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Qian Wu
Ruoxuan Cui
Yuke Li
Haoqi Zhu
ViT
225
5
0
10 Jan 2024
Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Computer Vision and Pattern Recognition (CVPR), 2024
Chen Zhao
Shuming Liu
K. Mangalam
Guocheng Qian
Fatimah Zohra
Abdulmohsen Alghannam
Jitendra Malik
Guohao Li
229
8
0
08 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
268
7
0
08 Jan 2024
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
154
5
0
08 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Guoying Zhao
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Yinan Han
Jianhua Tao
406
26
0
07 Jan 2024
Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features
A. Falahati
Mohammad Karim Safavi
Ardavan Elahi
Farhad Pakdaman
Moncef Gabbouj
AI4TS
153
2
0
06 Jan 2024
Subjective and Objective Analysis of Indian Social Media Video Quality
Sandeep Mishra
Mukul Jha
A. Bovik
206
1
0
05 Jan 2024
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Dimitrios Psychogyios
Emanuele Colleoni
Beatrice van Amsterdam
Chih-Yang Li
Shu-Yu Huang
...
Santiago Rodriguez
Juanita Puentes
Pablo Arbelaez
Omid Mohareri
Danail Stoyanov
199
38
0
31 Dec 2023
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
299
28
0
31 Dec 2023
A Large-Scale Re-identification Analysis in Sporting Scenarios: the Betrayal of Reaching a Critical Point
David Freire-Obregón
J. Lorenzo-Navarro
Oliverio J. Santana
Daniel Hernández-Sosa
Modesto Castrillón-Santana
CVBM
185
4
0
29 Dec 2023
Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization
Computer Vision and Pattern Recognition (CVPR), 2023
Ioanna Ntinou
Enrique Sanchez
Georgios Tzimiropoulos
254
7
0
29 Dec 2023
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Chenliang Xu
Jiebo Luo
Chenliang Xu
VLM
707
163
0
29 Dec 2023
3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression from Longitudinal OCTs
T. Emre
A. Chakravarty
Antoine Rivail
Dmitrii Lachinov
Oliver Leingang
...
S. Sivaprasad
Daniel Rueckert
A. Lotery
U. Schmidt-Erfurth
Hrvoje Bogunović
MedIm
245
8
0
28 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
153
0
0
24 Dec 2023
Classifying Soccer Ball-on-Goal Position Through Kicker Shooting Action
Javier Torón-Artiles
Daniel Hernández-Sosa
Oliverio J. Santana
J. Lorenzo-Navarro
David Freire-Obregón
124
3
0
23 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
212
6
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
227
6
0
21 Dec 2023
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
620
1
0
20 Dec 2023
Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis
Tianyao He
Huabin Liu
Yuxi Li
Xiao Ma
Cheng Zhong
Yang Zhang
Weiyao Lin
302
7
0
18 Dec 2023
Traffic Incident Database with Multiple Labels Including Various Perspective Environmental Information
Shota Nishiyama
Takuma Saito
Ryo Nakamura
Go Ohtani
Hirokatsu Kataoka
Kensho Hara
165
0
0
17 Dec 2023