ResearchTrend.AI
The Kinetics Human Action Video Dataset


19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,017 papers shown
Title
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang
Qixiang Zhang
Lehan Wang
Xuanqi Huang
Xiaomeng Li
VOS
VGen
62
0
0
14 Mar 2025
KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception
Yunpeng Qu
Kun Yuan
Qizhi Xie
Ming-Ting Sun
Chao Zhou
Jian Wang
75
1
0
13 Mar 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li
Renping Zhou
Jiawei Zhou
Yingwei Song
Johannes Herter
Minghan Qin
Gao Huang
Hanspeter Pfister
3DGS
VLM
68
0
0
13 Mar 2025
STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications
Andrew Gao
Jun Liu
AI4TS
58
0
0
11 Mar 2025
HERO: Human Reaction Generation from Videos
Chengjun Yu
Wei-dong Zhai
Yuhang Yang
Yang Cao
Zheng-jun Zha
VGen
56
0
0
11 Mar 2025
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos
Chen-Da Liu-Zhang
Lin Sui
Shuming Liu
Fangzhou Mu
Ziyi Wang
Bernard Ghanem
57
1
0
09 Mar 2025
Secure On-Device Video OOD Detection Without Backpropagation
Li Li
Peilin Cai
Yuxiao Zhou
Zhiyu Ni
Renjie Liang
You Qin
Yi Nian
Zhuowen Tu
Xiyang Hu
Yue Zhao
OODD
FedML
71
2
0
08 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
39
0
0
08 Mar 2025
Semi-Supervised Audio-Visual Video Action Recognition with Audio Source Localization Guided Mixup
Seokun Kang
Taehwan Kim
42
0
0
04 Mar 2025
Attention Bootstrapping for Multi-Modal Test-Time Adaptation
Yusheng Zhao
Junyu Luo
Xiao Luo
Jinsheng Huang
Jingyang Yuan
Zhiping Xiao
M. Zhang
TTA
92
0
0
04 Mar 2025
Exploring Simple Siamese Network for High-Resolution Video Quality Assessment
Guotao Shen
Ziheng Yan
Xin Jin
Longhai Wu
Jie Chen
Ilhyun Cho
Cheul-hee Hahm
47
0
0
04 Mar 2025
HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization
Zitang Zhou
Ke Mei
Yu Lu
Tianyi Wang
Fengyun Rao
94
2
0
03 Mar 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei
Yuanmin Huang
Jilan Xu
Guo Chen
Yuping He
...
Yali Wang
Weidi Xie
Yu Qiao
Fei Wu
Limin Wang
46
1
0
02 Mar 2025
Two-Stream Spatial-Temporal Transformer Framework for Person Identification via Natural Conversational Keypoints
Masoumeh Chapariniya
Hossein Ranjbar
Teodora Vukovic
Sarah Ebling
Volker Dellwo
3DPC
51
0
0
28 Feb 2025
AgroLLM: Connecting Farmers and Agricultural Practices through Large Language Models for Enhanced Knowledge Transfer and Practical Application
Dinesh Jackson Samuel
Inna Skarga-Bandurova
David Sikolia
Muhammad Awais
55
0
0
28 Feb 2025
Subtask-Aware Visual Reward Learning from Segmented Demonstrations
Changyeon Kim
Minho Heo
Doohyun Lee
Jinwoo Shin
Honglak Lee
Joseph J. Lim
Kimin Lee
44
1
0
28 Feb 2025
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Otto Brookes
Maksim Kukushkin
Majid Mirmehdi
Colleen Stephens
Paula Dieguez
...
Lukas Boesch
Thomas Schmid
M. Arandjelovic
H. Kühl
T. Burghardt
50
0
0
28 Feb 2025
Learning to Generalize without Bias for Open-Vocabulary Action Recognition
Yating Yu
Congqi Cao
Yifan Zhang
Yanning Zhang
VLM
45
0
0
27 Feb 2025
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
Shuming Liu
Chen Zhao
Fatimah Zohra
Mattia Soldan
Alejandro Pardo
...
Juan Carlos León Alcázar
A. Cioppa
Silvio Giancola
Carlos Hinojosa
Bernard Ghanem
70
3
0
27 Feb 2025
Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition
Hongda Liu
Yunlong Wang
Min Ren
Junxing Hu
Zhengquan Luo
Guangqi Hou
Zhe Sun
55
0
0
24 Feb 2025
Fine-Grained Video Captioning through Scene Graph Consolidation
Sanghyeok Chu
Seonguk Seo
Bohyung Han
63
1
0
23 Feb 2025
Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
Xinyu Tian
Shu Zou
Zhaoyuan Yang
Mengqi He
Jing Zhang
VLM
53
0
0
19 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
80
0
0
18 Feb 2025
EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild
Junhyeok Kim
Min Soo Kim
Jiwan Chung
Jungbin Cho
Jisoo Kim
Sungwoong Kim
Gyeongbo Sim
Youngjae Yu
EgoV
60
0
0
17 Feb 2025
Improving action segmentation via explicit similarity measurement
Kamel Aouaidjia
Wenhao Zhang
Aofan Li
Chongsheng Zhang
44
0
0
15 Feb 2025
Janus: Collaborative Vision Transformer Under Dynamic Network Environment
Linyi Jiang
Silvery Fu
Yifei Zhu
Bo Li
ViT
248
0
0
14 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
134
0
0
12 Feb 2025
A Survey on Mamba Architecture for Vision Applications
Fady Ibrahim
Guangjun Liu
Guanghui Wang
Mamba
66
3
0
11 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
50
0
0
11 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
121
7
0
10 Feb 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
132
3
0
07 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL
MQ
84
0
0
04 Feb 2025
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
Yang Cao
Zhao Song
Chiwun Yang
VGen
60
2
0
01 Feb 2025
Can masking background and object reduce static bias for zero-shot action recognition?
Takumi Fukuzawa
Kensho Hara
Hirokatsu Kataoka
Toru Tamaki
43
1
0
22 Jan 2025
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
Yi Wang
Xinhao Li
Ziang Yan
Yinan He
Jiashuo Yu
...
Kai Chen
Wenhai Wang
Yu Qiao
Yali Wang
Limin Wang
93
26
0
21 Jan 2025
Human Activity Recognition in an Open World
D. Prijatelj
Samuel Grieggs
Jin Huang
Dawei Du
Ameya Shringi
Christopher Funk
Adam Kaufman
Eric Robertson
Walter J. Scheirer (University of Notre Dame)
80
3
0
17 Jan 2025
A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement Prediction
Naval Kishore Mehta
Arvind
Himanshu Kumar
Abeer Banerjee
Sumeet Saurav
Sanjay Singh
54
0
0
10 Jan 2025
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos
Luigi Seminara
G. Farinella
Antonino Furnari
69
8
0
10 Jan 2025
MS-Temba: Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
66
3
0
10 Jan 2025
MLVU: Benchmarking Multi-task Long Video Understanding
Yueze Wang
Yan Shu
Bo Zhao
Boya Wu
Junjie Zhou
...
Xi Yang
Y. Xiong
Bo Zhang
Tiejun Huang
Zheng Liu
VLM
63
33
0
03 Jan 2025
Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection
Ayush Ghadiya
P. Kar
Vishal M. Chudasama
Pankaj Wasnik
54
1
0
31 Dec 2024
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
Xinhao Li
Yi Wang
Jiashuo Yu
Xiangyu Zeng
Yuhan Zhu
...
Yinan He
Chenting Wang
Yu Qiao
Yali Wang
L. Wang
VLM
89
26
0
31 Dec 2024
Finger in Camera Speaks Everything: Unconstrained Air-Writing for Real-World
Meiqi Wu
Kaiqi Huang
Yuanqiang Cai
Shiyu Hu
Yuzhong Zhao
Weiqiang Wang
VGen
45
1
0
27 Dec 2024
Sensitive Image Classification by Vision Transformers
Hanxian He
Campbell Wilson
Thanh Thi Nguyen
Janis Dalins
ViT
89
0
0
21 Dec 2024
LEARN: A Unified Framework for Multi-Task Domain Adapt Few-Shot Learning
Bharadwaj Ravichandran
Alexander Lynch
S. Brockman
Brandon RichardWebster
Dawei Du
A. Hoogs
Christopher Funk
ObjD
VLM
83
0
0
20 Dec 2024
Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning
Yunbin Tu
Liang-Sheng Li
Li Su
Qingming Huang
87
0
0
18 Dec 2024
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts
Taein Son
Soo Won Seo
Jisong Kim
S. Lee
Jun Won Choi
VGen
79
0
0
18 Dec 2024
Do Language Models Understand Time?
Xi Ding
Lei Wang
184
0
0
18 Dec 2024
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti
Eleonora Grassucci
Luigi Sigillo
Danilo Comminiello
102
1
0
16 Dec 2024
Training Strategies for Isolated Sign Language Recognition
Karina Kvanchiani
Roman Kraynov
Elizaveta Petrova
Petr Surovcev
Aleksandr Nagaev
A. Kapitanov
89
1
0
16 Dec 2024