The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,152 papers shown
Video-based surgical skill assessment using 3D convolutional neural networks
International Journal of Computer Assisted Radiology and Surgery (IJCARS), 2019
Isabel Funke
S. T. Mees
Jürgen Weitz
Stefanie Speidel
272
213
0
06 Mar 2019
KT-Speech-Crawler: Automatic Dataset Construction for Speech Recognition from YouTube Videos
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018
Egor Lakomkin
S. Magg
C. Weber
S. Wermter
113
20
0
01 Mar 2019
STAR-Net: Action Recognition using Spatio-Temporal Activation Reprojection
Canadian Conference on Computer and Robot Vision (CRV), 2019
William J. McNally
A. Wong
J. McPhee
HAI 3DH
116
29
0
26 Feb 2019
Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey
Longlong Jing
Yingli Tian
SSL
416
1,906
0
16 Feb 2019
Anomaly Locality in Video Surveillance
Federico Landi
Cees G. M. Snoek
Rita Cucchiara
127
64
0
29 Jan 2019
Spatio-temporal Action Recognition: A Survey
Amlaan Bhoi
73
14
0
27 Jan 2019
DistInit: Learning Video Representations Without a Single Labeled Video
Rohit Girdhar
Du Tran
Lorenzo Torresani
Deva Ramanan
198
58
0
26 Jan 2019
Audio-Visual Scene-Aware Dialog
Huda AlAmri
Vincent Cartillier
Abhishek Das
Jue Wang
A. Cherian
...
Tim K. Marks
Chiori Hori
Peter Anderson
Stefan Lee
Devi Parikh
VGen
277
213
0
25 Jan 2019
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition
Zheng Shou
Xudong Lin
Yannis Kalantidis
Laura Sevilla-Lara
Marcus Rohrbach
Shih-Fu Chang
Zhicheng Yan
VGen
262
129
0
11 Jan 2019
Cricket stroke extraction: Towards creation of a large-scale cricket actions dataset
Arpan Gupta
S. Muthiah
113
7
0
10 Jan 2019
Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions
Yifei Huang
Zhenqiang Li
Minjie Cai
Yoichi Sato
EgoV
276
82
0
07 Jan 2019
Action2Vec: A Crossmodal Embedding Approach to Action Learning
Meera Hahn
Andrew Silva
James M. Rehg
189
59
0
02 Jan 2019
Actor Conditioned Attention Maps for Video Action Detection
Oytun Ulutan
S. Rallapalli
Mudhakar Srivatsa
Carlos Torres
B. S. Manjunath
136
49
0
30 Dec 2018
Class-Aware Adversarial Lung Nodule Synthesis in CT Images
J. Yang
Siqi Liu
Sasa Grbic
A. Setio
Zhoubing Xu
Eli Gibson
G. Chabin
Bogdan Georgescu
Andrew F. Laine
Dorin Comaniciu
MedIm GAN
251
29
0
28 Dec 2018
D3D: Distilled 3D Networks for Video Action Recognition
Jonathan C. Stroud
David A. Ross
Chen Sun
Gaowen Liu
Rahul Sukthankar
3DPC
183
179
0
19 Dec 2018
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
134
34
0
17 Dec 2018
Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition with Multimodal Training
Mahdi Abavisani
Hamid Reza Vaezi Joze
Vishal M. Patel
228
148
0
14 Dec 2018
Adversarial Inference for Multi-Sentence Video Description
J. S. Park
Marcus Rohrbach
Trevor Darrell
Anna Rohrbach
248
89
0
13 Dec 2018
Nrityantar: Pose oblivious Indian classical dance sequence classification system
V. Kaushik
Prerana Mukherjee
Brejesh Lall
77
9
0
13 Dec 2018
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
548
3,839
0
10 Dec 2018
Weakly Supervised Dense Event Captioning in Videos
Xuguang Duan
Wen-bing Huang
Chuang Gan
Jingdong Wang
Wenwu Zhu
Junzhou Huang
168
164
0
10 Dec 2018
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
352
751
0
06 Dec 2018
Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition
Siqi Liu
Eli Gibson
Sasa Grbic
Zhoubing Xu
A. Setio
J. Yang
Bogdan Georgescu
Dorin Comaniciu
DiffM MedIm
269
18
0
04 Dec 2018
The Visual Centrifuge: Model-Free Layered Video Representations
Jean-Baptiste Alayrac
João Carreira
Andrew Zisserman
180
49
0
04 Dec 2018
Timeception for Complex Action Recognition
Noureldien Hussein
E. Gavves
A. Smeulders
266
229
0
04 Dec 2018
Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM VGen
789
1,032
0
03 Dec 2018
Multi-modal Capsule Routing for Actor and Action Video Segmentation Conditioned on Natural Language Queries
Bruce McIntosh
Kevin Duarte
Yogesh S Rawat
M. Shah
MedIm
133
17
0
02 Dec 2018
Graph-Based Global Reasoning Networks
Yunpeng Chen
Marcus Rohrbach
Zhicheng Yan
Shuicheng Yan
Jiashi Feng
Yannis Kalantidis
GNN NAI
485
493
0
30 Nov 2018
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
M. Joneidi
Alireza Zaeemzadeh
Nazanin Rahnavard
M. Shah
125
18
0
29 Nov 2018
Unsupervised Meta-Learning For Few-Shot Image Classification
Siavash Khodadadeh
Ladislau Bölöni
M. Shah
SSL VLM
230
156
0
28 Nov 2018
Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction
Longlong Jing
Xiaodong Yang
Jingen Liu
Yingli Tian
191
165
0
28 Nov 2018
Uncertainty aware audiovisual activity recognition using deep Bayesian variational inference
Mahesh Subedar
R. Krishnan
P. López-Meyer
Omesh Tickoo
Jonathan Huang
BDL EDL UQCV
182
0
0
27 Nov 2018
Evolving Space-Time Neural Architectures for Videos
A. Piergiovanni
A. Angelova
Alexander Toshev
Michael S. Ryoo
179
60
0
26 Nov 2018
Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation
P. Ghosh
Yi Yao
L. Davis
Ajay Divakaran
344
90
0
26 Nov 2018
Temporal Bilinear Networks for Video Action Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2018
Yanghao Li
Sijie Song
Yuqi Li
Jiaying Liu
124
34
0
25 Nov 2018
RGB-D Based Action Recognition with Light-weight 3D Convolutional Networks
Haokui Zhang
Ying Li
Peng Wang
Yu Liu
Chunhua Shen
3DPC
215
11
0
24 Nov 2018
Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles
AAAI Conference on Artificial Intelligence (AAAI), 2018
Dahun Kim
Donghyeon Cho
In So Kweon
SSL
252
363
0
24 Nov 2018
Learning from Multiview Correlations in Open-Domain Videos
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Nils Holzenberger
Shruti Palaskar
Pranava Madhyastha
Florian Metze
R. Arora
SSL
134
11
0
21 Nov 2018
MAC: Mining Activity Concepts for Language-based Temporal Localization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2018
Runzhou Ge
J. Gao
Kan Chen
Ram Nevatia
184
194
0
21 Nov 2018
TSM: Temporal Shift Module for Efficient Video Understanding
IEEE International Conference on Computer Vision (ICCV), 2018
Ji Lin
Chuang Gan
Song Han
632
1,934
0
20 Nov 2018
Multi-Task Learning of Generalizable Representations for Video Action Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2018
Zhiyu Yao
Yunbo Wang
Mingsheng Long
Jianmin Wang
Philip S Yu
Jiaguang Sun
75
3
0
20 Nov 2018
Segregated Temporal Assembly Recurrent Networks for Weakly Supervised Multiple Action Detection
AAAI Conference on Artificial Intelligence (AAAI), 2018
Yunlu Xu
Chengwei Zhang
Zhanzhan Cheng
Jianwen Xie
Yi Niu
Shiliang Pu
Leilei Gan
222
83
0
19 Nov 2018
Recurrent Convolutions for Causal 3D CNNs
Gurkirt Singh
Fabio Cuzzolin
3DPC
131
0
0
17 Nov 2018
Natural Environment Benchmarks for Reinforcement Learning
Amy Zhang
Yuxin Wu
Joelle Pineau
OffRL OOD
183
69
0
14 Nov 2018
Skeleton-Based Action Recognition with Synchronous Local and Non-local Spatio-temporal Learning and Frequency Attention
Guyue Hu
Bo Cui
Shan Yu
229
42
0
10 Nov 2018
Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2018
Sanjeel Parekh
A. Ozerov
S. Essid
Ngoc Q. K. Duong
P. Pérez
G. Richard
119
16
0
09 Nov 2018
Cross and Learn: Cross-Modal Self-Supervision
German Conference on Pattern Recognition (DAGM), 2018
Nawid Sayed
Biagio Brattoli
Bjorn Ommer
SSL
250
83
0
09 Nov 2018
Multimodal Grounding for Sequence-to-Sequence Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Ozan Caglayan
Ramon Sanabria
Shruti Palaskar
Loïc Barrault
Florian Metze
149
25
0
09 Nov 2018
BAR: Bayesian Activity Recognition using variational inference
R. Krishnan
Mahesh Subedar
S. Bhatnagar
BDL UQCV
259
22
0
08 Nov 2018
Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2018
Yoonchang Sung
Jiawei Wu
Da Zhang
Yu-Chuan Su
Erfaun Noorani
224
39
0
07 Nov 2018