Beyond Short Snippets: Deep Networks for Video Classification

31 March 2015
Joe Yue-Hei Ng
Matthew J. Hausknecht
Sudheendra Vijayanarasimhan
Oriol Vinyals
R. Monga
G. Toderici

Papers citing "Beyond Short Snippets: Deep Networks for Video Classification"

50 / 739 papers shown
Title
REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding
Sakib Reza
Xiyun Song
Heather Yu
Zongfang Lin
Mohsen Moghaddam
Mario Sznaier
31
0
0
07 Apr 2025
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura
Antoine Yang
Cordelia Schmid
Gül Varol
41
0
0
31 Mar 2025
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas
Efstathia Soufleri
Arani Roy
Kaushik Roy
59
0
0
17 Mar 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Qingbin Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Zenan Zhou
Xin Wu
AuLLM
75
13
0
28 Jan 2025
Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition
Yulin Wang
Haoji Zhang
Yang Yue
Shiji Song
Chao Deng
Junlan Feng
Gao Huang
86
3
0
15 Dec 2024
An End-to-End Two-Stream Network Based on RGB Flow and Representation Flow for Human Action Recognition
Song-Jiang Lai
Tsun-hin Cheung
Ka-Chun Fung
Tian-Shan Liu
K. Lam
63
0
0
27 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
Gating Syn-to-Real Knowledge for Pedestrian Crossing Prediction in Safe Driving
Jie Bai
Jianwu Fang
Yisheng Lv
Chen Lv
Jianru Xue
Zhengguo Li
37
0
0
24 Aug 2024
Faster Diffusion Action Segmentation
Shuai Wang
Shunli Wang
Mingcheng Li
Dingkang Yang
Haopeng Kuang
Ziyun Qian
Lihua Zhang
42
0
0
04 Aug 2024
TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation
Andre Sacilotti
Samuel Felipe dos Santos
N. Sebe
Jurandy Almeida
ViT
50
1
0
01 Jul 2024
Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns
Haoren Zhu
Pengfei Zhao
Wilfred Siu Hung NG
Dik Lun Lee
29
0
0
13 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
43
7
0
02 Jun 2024
mTREE: Multi-Level Text-Guided Representation End-to-End Learning for Whole Slide Image Analysis
Quan Liu
Ruining Deng
Can Cui
Tianyuan Yao
V. Nath
Yucheng Tang
Yuankai Huo
43
0
0
28 May 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
34
0
0
28 May 2024
Learning text-to-video retrieval from image captioning
Lucas Ventura
Cordelia Schmid
Gül Varol
3DV
44
3
0
26 Apr 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He
Hengduo Li
Young Kyun Jang
Menglin Jia
Xuefei Cao
Ashish Shah
Abhinav Shrivastava
Ser-Nam Lim
MLLM
83
89
0
08 Apr 2024
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
Chih-Chung Hsu
Chia-Ming Lee
Chiang Fan Yang
Yi-Shiuan Chou
Chih-Yu Jiang
Shen-Chieh Tai
Chin-Han Tsai
44
0
0
02 Apr 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
42
1
0
24 Mar 2024
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
Erkut Akdag
Zeqi Zhu
Egor Bondarev
Peter H. N. de With
ViT
37
5
0
11 Mar 2024
LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs
Yunxin Li
Xinyu Chen
Baotian Hu
Min-Ling Zhang
45
3
0
21 Feb 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
56
84
0
29 Dec 2023
Early Action Recognition with Action Prototypes
G. Camporese
Alessandro Bergamo
Xunyu Lin
Joseph Tighe
Davide Modolo
EgoV
21
0
0
11 Dec 2023
DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding
Kyungho Bae
Geo Ahn
Youngrae Kim
Jinwoo Choi
30
3
0
30 Nov 2023
Object-based (yet Class-agnostic) Video Domain Adaptation
Dantong Niu
Amir Bar
Roei Herzig
Trevor Darrell
Anna Rohrbach
40
1
0
29 Nov 2023
Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations
Lei Fan
Jianxiong Zhou
Xiaoying Xing
Ying Wu
VLM
41
3
0
28 Nov 2023
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan
Mingfu Liang
Yunxuan Li
Gang Hua
Ying Wu
22
5
0
23 Nov 2023
GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
Hyogun Lee
Kyungho Bae
Seong Jong Ha
Yumin Ko
Gyeong-Moon Park
Jinwoo Choi
16
2
0
21 Nov 2023
MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation
Ehsan Asali
Prashant Doshi
Jin Sun
33
1
0
14 Nov 2023
Subtle Signals: Video-based Detection of Infant Non-nutritive Sucking as a Neurodevelopmental Cue
Shaotong Zhu
Michael Wan
Sai Kumar Reddy Manne
Emily B. Zimmerman
Sarah Ostadabbas
21
2
0
24 Oct 2023
Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks
Ziqiang Li
Pengfei Xia
Hong Sun
Yueqi Zeng
Wei Zhang
Bin Li
AAML
48
10
0
15 Oct 2023
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval
P. Li
Hongtao Xie
Jiannan Ge
Lei Zhang
Shaobo Min
Yongdong Zhang
23
17
0
12 Oct 2023
A novel asymmetrical autoencoder with a sparsifying discrete cosine Stockwell transform layer for gearbox sensor data compression
Xin Zhu
Daoguang Yang
Hongyi Pan
Hamid Reza Karimi
Didem Ozevin
Ahmet Enis Cetin
24
14
0
04 Oct 2023
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding
Mohamed Afham
Satya Narayan Shukla
Omid Poursaeed
Pengchuan Zhang
Ashish Shah
Sernam Lim
VLM
32
2
0
20 Sep 2023
Improving Video Violence Recognition with Human Interaction Learning on 3D Skeleton Point Clouds
Yukun Su
Guosheng Lin
Qingyao Wu
3DH
3DPC
29
3
0
26 Aug 2023
Temporal-Distributed Backdoor Attack Against Video Based Action Recognition
Xi Li
Songhe Wang
Rui Huang
Mahanth K. Gowda
G. Kesidis
AAML
41
6
0
21 Aug 2023
Event-Guided Procedure Planning from Instructional Videos with Text Supervision
Ante Wang
Kun-Li Channing Lin
Jiachen Du
Jingke Meng
Wei-Shi Zheng
23
15
0
17 Aug 2023
Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction
Izzeddin Teeti
Rongali Sai Bhargav
Vivek Singh
Andrew Bradley
Biplab Banerjee
Fabio Cuzzolin
19
1
0
08 Aug 2023
SkateboardAI: The Coolest Video Action Recognition for Skateboarding
Hanxiao Chen
ViT
11
4
0
02 Aug 2023
Multi-Modal Machine Learning for Assessing Gaming Skills in Online Streaming: A Case Study with CS:GO
Longxiang Zhang
Wenping Wang
51
1
0
23 Jul 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
40
8
0
18 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
Active Learning for Video Classification with Frame Level Queries
D. Goswami
Shayok Chakraborty
VLM
13
2
0
10 Jul 2023
Iterative-in-Iterative Super-Resolution Biomedical Imaging Using One Real Image
Yuanzheng Ma
Xinyue Wang
Benqi Zhao
Ying Xiao
Shijie Deng
Jian Song
Xun Guan
SupR
MedIm
24
1
0
26 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
26
0
0
23 Jun 2023
A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU
Farhad Shiri
Thinagaran Perumal
N. Mustapha
Raihani Mohamed
AILaw
SyDa
ELM
AI4TS
55
130
0
27 May 2023
Malicious or Benign? Towards Effective Content Moderation for Children's Videos
Syed Hammad Ahmed
M. Khan
Hafiz Muhammad Umer Qaisar
G. Sukthankar
14
5
0
24 May 2023
Smart Pressure e-Mat for Human Sleeping Posture and Dynamic Activity Recognition
Liangqi Yuan
Yuan Wei
Jia Li
26
3
0
19 May 2023
Is end-to-end learning enough for fitness activity recognition?
Antoine Mercier
Guillaume Berger
Sunny Panchal
Florian Letsch
Cornelius Boehm
Nahua Kang
Ingo Bax
Roland Memisevic
28
2
0
14 May 2023
Local and Global Contextual Features Fusion for Pedestrian Intention Prediction
Mohsen Azarmi
Mahdi Rezaei
Tanveer Hussain
Chenghao Qian
46
8
0
01 May 2023