ResearchTrend.AI
SlowFast Networks for Video Recognition (arXiv:1812.03982)

10 December 2018
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He

Papers citing "SlowFast Networks for Video Recognition"

50 / 503 papers shown
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
Ho-Joong Kim
Y. E. Lee
Jung-Ho Hong
Seong-Whan Lee
40
0
0
09 May 2025
Reducing Annotation Burden in Physical Activity Research Using Vision-Language Models
Abram Schonfeldt
Benjamin Maylor
Xiaofang Chen
Ronald Clark
Aiden Doherty
68
0
0
06 May 2025
Breaking Annotation Barriers: Generalized Video Quality Assessment via Ranking-based Self-Supervision
Linhan Cao
Wei Sun
Kaiwei Zhang
Yicong Peng
Guangtao Zhai
Xiongkuo Min
52
0
0
06 May 2025
Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study
Tamim Ahmed
Thanassis Rikakis
29
0
0
03 May 2025
Vehicular Communication Security: Multi-Channel and Multi-Factor Authentication
Marco De Vincenzi
S.
Chen Bo Calvin Zhang
Manuel Garcia
Shaozu Ding
Chiara Bodei
Ilaria Matteucci
Dajiang Suo
49
0
0
01 May 2025
Advance Fake Video Detection via Vision Transformers
Joy Battocchio
S. Dell’Anna
Andrea Montibeller
Giulia Boato
ViT
VGen
34
0
0
29 Apr 2025
Beyond the Horizon: Decoupling UAVs Multi-View Action Recognition via Partial Order Transfer
Wenxuan Liu
X. Zhong
Zhuo Zhou
S. Yang
Chia-Wen Lin
Alex Chichung Kot
32
0
0
29 Apr 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
84
0
0
28 Apr 2025
STCOcc: Sparse Spatial-Temporal Cascade Renovation for 3D Occupancy and Scene Flow Prediction
Zhimin Liao
Ping Wei
Shuaijia Chen
Haoxuan Wang
Ziyang Ren
100
0
0
28 Apr 2025
ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding
Yi-Xing Peng
Q. Yang
Yu-Ming Tang
Shenghao Fu
Kun-Yu Lin
Xihan Wei
Wei-Shi Zheng
42
0
0
25 Apr 2025
HierSum: A Global and Local Attention Mechanism for Video Summarization
Apoorva Beedu
Irfan Essa
67
0
0
25 Apr 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
91
0
0
24 Apr 2025
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Xin Li
Kun Yuan
B. Li
Fengbin Guan
Yizhen Shao
...
Guohua Zhang
Z. Huang
Y. Deng
Qingmiao Jiang
Lu Chen
55
7
0
17 Apr 2025
F$^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhé Hóu
Yun Lin
J. Dong
37
0
0
11 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
146
0
0
04 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao W. Wang
Songruoyao Wu
Jiaxing Yu
K. Zhang
MGen
VGen
70
1
0
01 Apr 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
199
0
0
26 Mar 2025
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
60
0
0
17 Mar 2025
A Large-Scale Study on Video Action Dataset Condensation
Yang Chen
Sheng Guo
Bo Zheng
Limin Wang
DD
77
2
0
13 Mar 2025
Analysis of 3D Urticaceae Pollen Classification Using Deep Learning Models
Tijs Konijn
Imaan Bijl
Lu Cao
Fons Verbeek
53
0
0
10 Mar 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei
Y. Huang
Jilan Xu
Guo Chen
Yuping He
...
Yali Wang
Weidi Xie
Yu Qiao
Fei Wu
Limin Wang
41
0
0
02 Mar 2025
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang
Meng Cao
Jinfa Huang
Ruyang Liu
Peng Jin
Ge Li
Xiaodan Liang
Mamba
96
4
0
24 Feb 2025
Myna: Masking-Based Contrastive Learning of Musical Representations
Ori Yonay
Tracy Hammond
Tianbao Yang
AAML
58
0
0
20 Feb 2025
SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference
Zhen Chen
Xingjian Luo
Jinlin Wu
Long Bai
Zhen Lei
Hongliang Ren
Sebastien Ourselin
Hongbin Liu
61
0
0
17 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
42
0
0
11 Feb 2025
Conformal Predictions for Human Action Recognition with Vision-Language Models
Tim Bary
Clément Fuchs
Benoît Macq
VLM
46
0
0
10 Feb 2025
An object detection approach for lane change and overtake detection from motion profiles
Andrea Benericetti
Niccolò Bellaccini
Henrique Piñeiro Monteagudo
Matteo Simoncini
Francesco Sambo
68
0
0
06 Feb 2025
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
Pengcheng Zhao
Zhixian He
Fuwei Zhang
Shujin Lin
Fan Zhou
42
1
0
18 Jan 2025
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation
P. Guhan
Tsung-Wei Huang
Guan-Ming Su
Subhadra Gopalakrishnan
Dinesh Manocha
VGen
63
0
0
14 Jan 2025
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi
Skanda Koppula
Shreya Pathak
Justin T Chiu
Joseph Heyward
Viorica Patraucean
Jiajun Shen
Antoine Miech
Andrew Zisserman
Aida Nematzadeh
VLM
60
24
0
31 Dec 2024
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu
Boran Wen
Xinpeng Liu
Zizheng Zhou
Hongwei Fan
Cewu Lu
Lizhuang Ma
Yulong Chen
Y. Li
56
2
0
27 Dec 2024
Do Language Models Understand Time?
Xi Ding
Lei Wang
178
0
0
18 Dec 2024
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
100
1
0
03 Dec 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
117
1
0
25 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
98
0
0
20 Nov 2024
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
146
1
0
30 Oct 2024
Secure Video Quality Assessment Resisting Adversarial Attacks
Ao Zhang
Yu Ran
Weixuan Tang
Yuan-Gen Wang
Qingxiao Guan
Chunsheng Yang
AAML
29
0
0
09 Oct 2024
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
Minoh Jeong
Min Namgung
Zae Myung Kim
Dongyeop Kang
Yao-Yi Chiang
Alfred Hero
25
0
0
02 Oct 2024
Spacewalker: Traversing Representation Spaces for Fast Interactive Exploration and Annotation of Unstructured Data
Lukas Heine
Fabian Horst
Jana Fragemann
Gijs Luijten
M. Balzer
Jan Egger
F. Bahnsen
M. Sarfraz
Jens Kleesiek
25
0
0
25 Sep 2024
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Xinrui Zhou
Yuhao Huang
Haoran Dou
Shijing Chen
Ao Chang
...
Jie Jessie Ren
Ruobing Huang
Jun Cheng
Wufeng Xue
Dong Ni
MedIm
132
0
0
25 Sep 2024
Introducing Gating and Context into Temporal Action Detection
Aglind Reka
Diana Laura Borza
Dominick Reilly
Michal Balazia
Francois Bremond
20
0
0
06 Sep 2024
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
28
0
0
06 Sep 2024
Towards Student Actions in Classroom Scenes: New Dataset and Baseline
Zhuolin Tan
Chenqiang Gao
Anyong Qin
Ruixin Chen
Tiecheng Song
Feng Yang
Deyu Meng
29
0
0
02 Sep 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Aly A. Farag
3DPC
26
0
0
10 Aug 2024
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection
Taichi Nishimura
Shota Nakada
Hokuto Munakata
Tatsuya Komatsu
VLM
14
1
0
06 Aug 2024
RICA2: Rubric-Informed, Calibrated Assessment of Actions
Abrar Majeedi
Viswanatha Reddy Gajjala
Satya Sai Srinath Namburi Gnvv
Yin Li
CML
26
2
0
04 Aug 2024
Faster Diffusion Action Segmentation
Shuai Wang
Shunli Wang
Mingcheng Li
Dingkang Yang
Haopeng Kuang
Ziyun Qian
Lihua Zhang
34
0
0
04 Aug 2024
MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection
Xiangbo Gao
A. Kanu-Asiegbu
Xiaoxiao Du
Mamba
35
0
0
02 Aug 2024
Learning Video Context as Interleaved Multimodal Sequences
S. Shao
Pengchuan Zhang
Y. Li
Xide Xia
A. Meso
Ziteng Gao
Jinheng Xie
N. Holliman
Mike Zheng Shou
43
5
0
31 Jul 2024
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
27
1
0
30 Jul 2024