ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2303.09756
  4. Cited By
Video Action Recognition with Attentive Semantic Units

Video Action Recognition with Attentive Semantic Units

17 March 2023
Yifei Chen
Dapeng Chen
Ruijin Liu
Hao Li
Wei Peng
ArXivPDFHTML

Papers citing "Video Action Recognition with Attentive Semantic Units"

18 / 18 papers shown
Title
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Pau de Jorge
Diane Larlus
Yannis Kalantidis
52
0
0
18 Mar 2025
A Comprehensive Review of Few-shot Action Recognition
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
52
3
0
20 Jul 2024
Pose-guided multi-task video transformer for driver action recognition
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
24
0
0
18 Jul 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach
  with Hierarchical Interactions
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
19
0
0
28 May 2024
Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary
  Action Recognition
Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition
Kun-Yu Lin
Henghui Ding
Jiaming Zhou
Yu-Ming Tang
Yi-Xing Peng
Zhilin Zhao
Chen Change Loy
Wei-Shi Zheng
VLM
27
15
0
03 Mar 2024
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for
  General Video Recognition
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tom Tongjia Chen
Hongshan Yu
Zhengeng Yang
Zechuan Li
Wei Sun
Chen Chen
10
7
0
30 Nov 2023
Align before Adapt: Leveraging Entity-to-Region Alignments for
  Generalizable Video Action Recognition
Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen
Dapeng Chen
Ruijin Liu
Sai Zhou
Wenyuan Xue
Wei Peng
20
6
0
27 Nov 2023
Towards Open Vocabulary Learning: A Survey
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Bernard Ghanem
Dacheng Tao
ObjD
VLM
25
134
0
28 Jun 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual
  Recognition
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
142
360
0
24 Jan 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text
  Understanding
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
245
554
0
28 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
360
0
17 Sep 2021
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,249
0
02 Sep 2021
Open-vocabulary Object Detection via Vision and Language Knowledge
  Distillation
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
220
897
0
28 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
Rethinking Zero-shot Video Classification: End-to-end Training for
  Realistic Applications
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
126
127
0
03 Mar 2020
1