The Kinetics Human Action Video Dataset

19 May 2017

Sudheendra Vijayanarasimhan

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,152 papers shown

PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance

243

04 Nov 2024

SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities

Ehsan Faghihi

Mohammedreza Zarenejad

Ali-Asghar Beheshti Shirazi

271

04 Nov 2024

Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge

Neural Information Processing Systems (NeurIPS), 2024

...

471

04 Nov 2024

ROAD-Waymo: Action Awareness at Scale for Autonomous Driving

Salman Khan

Izzeddin Teeti

Reza Javanmard Alitappeh

292

03 Nov 2024

STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models

Zerui Wang

Yan Liu

312

01 Nov 2024

Learning Video Representations without Natural Videos

247

31 Oct 2024

Video Token Merging for Long-form Video Understanding

290

31 Oct 2024

Deep Convolutional Neural Networks on Multiclass Classification of Three-Dimensional Brain Images for Parkinson's Disease Stage Prediction

183

31 Oct 2024

EchoFM: Foundation Model for Generalizable Echocardiogram Analysis

IEEE Transactions on Medical Imaging (IEEE TMI), 2024

Yiwei Li

265

30 Oct 2024

AtGCN: A Graph Convolutional Network For Ataxic Gait Detection

Karan Bania

Tanmay Verlekar

117

30 Oct 2024

Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Adrian Iordache

B. Alexe

Radu Tudor Ionescu

304

29 Oct 2024

Analytic Continual Test-Time Adaptation for Multi-Modality Corruption

212

29 Oct 2024

USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024

198

29 Oct 2024

LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

N. V. R. Chappa

Khoa Luu

177

28 Oct 2024

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events

European Conference on Computer Vision (ECCV), 2024

...

Hongsheng Li

402

27 Oct 2024

NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction

Neural Information Processing Systems (NeurIPS), 2024

...

Yu Zhang

256

25 Oct 2024

Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance

M. Asres

Lei Jiao

C. Omlin

208

24 Oct 2024

AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

449

22 Oct 2024

Masked Differential Privacy

227

22 Oct 2024

Detecting Adversarial Examples

Furkan Mumcu

Yasin Yilmaz

AAML

259

22 Oct 2024

Storyboard guided Alignment for Fine-grained Video Action Recognition

Yan Yang

185

18 Oct 2024

Human Action Anticipation: A Survey

295

17 Oct 2024

MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations

Yichao Yan

Xiaokang Yang

252

17 Oct 2024

On-the-fly Modulation for Balanced Multimodal Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

233

15 Oct 2024

MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer

Neural Information Processing Systems (NeurIPS), 2024

250

14 Oct 2024

Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions

234

13 Oct 2024

Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition

International Conference on Data Science and Advanced Analytics (DSAA), 2024

130

10 Oct 2024

Evaluating Model Performance with Hard-Swish Activation Function Adjustments

Sai Abhinav Pydimarry

Shekhar Madhav Khairnar

Sofia Garces Palacios

Ganesh Sankaranarayanan

Darian Hoagland

Dmitry Nepomnayshy

Huu Phong Nguyen

09 Oct 2024

Secure Video Quality Assessment Resisting Adversarial Attacks

IEEE Transactions on Broadcasting (IEEE Trans. Broadcast.), 2024

251

09 Oct 2024

MTFL: Multi-Timescale Feature Learning for Weakly-Supervised Anomaly Detection in Surveillance Videos

International Conference on Machine Vision (ICMV), 2024

195

08 Oct 2024

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Kaipeng Zhang

Yu Cheng

Dianqi Li

Yu Qiao

Ping Luo

VGen EGVM

256

07 Oct 2024

Bisimulation metric for Model Predictive Control

International Conference on Learning Representations (ICLR), 2024

Yutaka Shimizu

Masayoshi Tomizuka

211

06 Oct 2024

Linear Transformer Topological Masking with Graph Random Features

International Conference on Learning Representations (ICLR), 2024

...

Richard E. Turner

Adrian Weller

Krzysztof Choromanski

283

04 Oct 2024

Computer-aided Colorization State-of-the-science: A Survey

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024

P. Y. Mok

226

03 Oct 2024

An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos

Corban Rivera

Rama Chellappa

147

03 Oct 2024

LLaVA-Video: Video Instruction Tuning With Synthetic Data

484

248

03 Oct 2024

COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation

Mingzhen Sun

Weining Wang

Xinxin Zhu

Jing Liu

VGen DiffM

174

02 Oct 2024

Tracking objects that change in appearance with phase synchrony

International Conference on Learning Representations (ICLR), 2024

Thomas Serre

335

02 Oct 2024

Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

International Conference on Learning Representations (ICLR), 2024

Tom Gedeon

430

02 Oct 2024

Delving Deep into Engagement Prediction of Short Videos

European Conference on Computer Vision (ECCV), 2024

Wenjie Li

Hongsheng Li

Jian Wang

375

30 Sep 2024

REST-HANDS: Rehabilitation with Egocentric Vision Using Smartglasses for Treatment of Hands after Surviving Stroke

Wiktor Mucha

Kentaro Tanaka

M. Kampel

236

30 Sep 2024

CycleCrash: A Dataset of Bicycle Collision Videos for Collision Prediction and Analysis

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Nishq Poorav Desai

Ali Etemad

Michael A. Greenspan

303

30 Sep 2024

Fast Encoding and Decoding for Implicit Video Representation

European Conference on Computer Vision (ECCV), 2024

Hao Chen

Saining Xie

Ser-Nam Lim

Abhinav Shrivastava

269

28 Sep 2024

Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks

Min Yang

Zichen Zhang

Limin Wang

AI4TS

215

27 Sep 2024

SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining

Xijun Wang

Dinesh Manocha

265

26 Sep 2024

Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming

Wei Sun

Xiongkuo Min

Guangtao Zhai

162

26 Sep 2024

EAGLE: Egocentric AGgregated Language-video Engine

ACM Multimedia (MM), 2024

222

26 Sep 2024

Towards Synthetic Data Generation for Improved Pain Recognition in Videos under Patient Constraints

133

24 Sep 2024

Self-Supervised Any-Point Tracking by Contrastive Random Walks

European Conference on Computer Vision (ECCV), 2024

Ayush Shrivastava

Andrew Owens

219

24 Sep 2024

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment

European Conference on Computer Vision (ECCV), 2024

Yu Kong

Martin Renqiang Min

Dimitris N. Metaxas

DiffM

291

22 Sep 2024