Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 675 papers shown

Masked Autoencoder for Unsupervised Video Summarization

176

02 Jun 2023

Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning

European Conference on Computer Vision (ECCV), 2023

Sanjoy Kundu

Shubham Trehan

Sathyanarayanan N. Aakur

LRM LM&Ro

311

26 May 2023

Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective

Neurocomputing (Neurocomputing), 2023

Thanh-Dat Truong

Khoa Luu

EgoV

393

25 May 2023

TG-VQA: Ternary Game of Video Question Answering

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Hao Li

241

17 May 2023

Lightweight Delivery Detection on Doorbell Cameras

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

199

13 May 2023

Visual Tuning

ACM Computing Surveys (ACM Comput. Surv.), 2023

...

440

10 May 2023

Improve Video Representation with Temporal Adversarial Augmentation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

244

28 Apr 2023

SSTM: Spatiotemporal Recurrent Transformers for Multi-frame Optical Flow Estimation

Neurocomputing (Neurocomputing), 2023

Fisseha Admasu Ferede

M. Balasubramanian

135

26 Apr 2023

MRSN: Multi-Relation Support Network for Video Action Detection

IEEE International Conference on Multimedia and Expo (ICME), 2023

272

24 Apr 2023

Implicit Temporal Modeling with Learnable Alignment for Video Recognition

IEEE International Conference on Computer Vision (ICCV), 2023

Zuxuan Wu

313

20 Apr 2023

Pretrained Language Models as Visual Planners for Human Assistance

IEEE International Conference on Computer Vision (ICCV), 2023

Ruta Desai

355

17 Apr 2023

LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision

International Conference on Learning Representations (ICLR), 2023

Jiani Huang

Ziyang Li

Mayur Naik

Ser-Nam Lim

678

15 Apr 2023

Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment

184

13 Apr 2023

AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection

320

12 Apr 2023

Scallop: A Language for Neurosymbolic Programming

218

10 Apr 2023

Hyperspectral Image Super-Resolution via Dual-domain Network Based on Hybrid Convolution

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023

546

10 Apr 2023

SparseFormer: Sparse Visual Recognition via Limited Latent Tokens

International Conference on Learning Representations (ICLR), 2023

181

07 Apr 2023

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

Computer Vision and Pattern Recognition (CVPR), 2023

Salman Khan

232

112

06 Apr 2023

Sketch-based Video Object Localization

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

438

02 Apr 2023

DOAD: Decoupled One Stage Action Detection Network

Fan Wang

194

01 Apr 2023

Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations

Computer Vision and Pattern Recognition (CVPR), 2023

253

31 Mar 2023

Streaming Video Model

Computer Vision and Pattern Recognition (CVPR), 2023

252

30 Mar 2023

What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Computer Vision and Pattern Recognition (CVPR), 2023

363

29 Mar 2023

CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Lei Chen

Zhan Tong

Yibing Song

Gangshan Wu

Limin Wang

195

28 Mar 2023

Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling

Computer Vision and Pattern Recognition (CVPR), 2023

220

27 Mar 2023

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

Computer Vision and Pattern Recognition (CVPR), 2023

309

27 Mar 2023

A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition

IEEE International Conference on Computer Vision (ICCV), 2023

244

23 Mar 2023

Natural Language-Assisted Sign Language Recognition

Computer Vision and Pattern Recognition (CVPR), 2023

230

21 Mar 2023

Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization

IEEE International Conference on Computer Vision (ICCV), 2023

348

20 Mar 2023

Dual-path Adaptation from Image to Video Transformers

Computer Vision and Pattern Recognition (CVPR), 2023

250

17 Mar 2023

Video Action Recognition with Attentive Semantic Units

IEEE International Conference on Computer Vision (ICCV), 2023

Yifei Chen

Dapeng Chen

Ruijin Liu

Hao Li

Wei Peng

223

17 Mar 2023

CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective

Computer Vision and Pattern Recognition (CVPR), 2023

Wei Huang

Guangtao Zhai

164

11 Mar 2023

TQ-Net: Mixed Contrastive Representation Learning For Heterogeneous Test Questions

154

09 Mar 2023

Improving Video Retrieval by Adaptive Margin

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021

295

09 Mar 2023

Text-Visual Prompting for Efficient 2D Temporal Video Grounding

Computer Vision and Pattern Recognition (CVPR), 2023

274

09 Mar 2023

Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking

IEEE International Conference on Robotics and Automation (ICRA), 2023

250

08 Mar 2023

Continuous Sign Language Recognition with Correlation Network

Computer Vision and Pattern Recognition (CVPR), 2023

363

116

06 Mar 2023

Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition

International Conference on Learning Representations (ICLR), 2023

199

05 Mar 2023

Temporal Coherent Test-Time Optimization for Robust Video Classification

International Conference on Learning Representations (ICLR), 2023

218

28 Feb 2023

Contrastive Video Question Answering via Video Graph Transformer

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Angela Yao

252

27 Feb 2023

Deep Learning for Video-Text Retrieval: a Review

International Journal of Multimedia Information Retrieval (IJMIR), 2023

230

24 Feb 2023

STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training

AAAI Conference on Artificial Intelligence (AAAI), 2023

390

20 Feb 2023

Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer

Scientific Reports (Sci Rep), 2023

N. H. Phong

B. Ribeiro

283

17 Feb 2023

CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

...

256

13 Feb 2023

Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer

AAAI Conference on Artificial Intelligence (AAAI), 2023

246

04 Feb 2023

Learning Large-scale Neural Fields via Context Pruned Meta-Learning

Neural Information Processing Systems (NeurIPS), 2023

Jonathan Richard Schwarz

311

01 Feb 2023

Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval

AAAI Conference on Artificial Intelligence (AAAI), 2023

Ying Shan

257

30 Jan 2023

Semi-Parametric Video-Grounded Text Generation

250

27 Jan 2023

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

Computer Vision and Pattern Recognition (CVPR), 2023

262

26 Jan 2023

Gated-ViGAT: Efficient Bottom-Up Event Recognition and Explanation Using a New Frame Selection Policy and Gating Mechanism

IEEE International Symposium on Multimedia (ISM), 2022

Nikolaos Gkalelis

Dimitrios Daskalakis

Vasileios Mezaris

155

18 Jan 2023