Adaptive Focus for Efficient Video Recognition

IEEE International Conference on Computer Vision (ICCV), 2021
7 May 2021
Yulin Wang
Zhaoxi Chen
Haojun Jiang
Shiji Song
Yizeng Han
Gao Huang

Papers citing "Adaptive Focus for Efficient Video Recognition"

50 / 71 papers shown
Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
Yifan Pu
Jixuan Ying
Qixiu Li
Tianzhu Ye
Dongchen Han
Xiaochen Wang
Ziyi Wang
Xinyu Shao
Gao Huang
Xiu Li
ViT
125
0
0
02 Nov 2025
Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal Modeling
Computer Vision and Image Understanding (CVIU), 2025
Tim J. Schoonbeek
Shao-Hsuan Hung
Dan Lehman
H. Onvlee
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
121
0
0
14 Oct 2025
A Survey on Efficiency Optimization Techniques for DNN-based Video Analytics: Process Systems, Algorithms, and Applications
Shanjiang Tang
Rui Huang
Hsinyu Luo
C. Wang
Ce Yu
Yusen Li
Hao Fu
Chao Sun
Jian Xiao
157
0
0
21 Jul 2025
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams
Haoji Zhang
Yiqin Wang
Yansong Tang
Yong-Jin Liu
Jiashi Feng
Xiaojie Jin
VLM
265
11
0
30 Jun 2025
Dynamic-Aware Video Distillation: Optimizing Temporal Resolution Based on Video Semantics
Yinjie Zhao
Heng Zhao
Bihan Wen
Yew-Soon Ong
Joey Tianyi Zhou
VGen
179
1
0
28 May 2025
Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition
Mengzhu Li
Quanxing Zha
Hongjun Wu
CVBM
221
1
0
28 Feb 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Yifan Pu
Yiming Zhao
Zhicong Tang
Ruihong Yin
Haoxing Ye
...
Ji Li
Xiu Li
Zheng Lian
Gao Huang
Baining Guo
DiffM
406
20
0
25 Feb 2025
Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yulin Wang
Haoji Zhang
Yang Yue
Shiji Song
Chao Deng
Junlan Feng
Gao Huang
286
12
0
15 Dec 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Neural Information Processing Systems (NeurIPS), 2024
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Xingtai Lv
Gao Huang
269
10
0
11 Nov 2024
Dynamic Diffusion Transformer
International Conference on Learning Representations (ICLR), 2024
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Yibing Song
Gao Huang
Fan Wang
Yang You
329
33
0
04 Oct 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
European Conference on Computer Vision (ECCV), 2024
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
349
15
0
31 Aug 2024
UltraSeP: Sequence-aware Pre-training for Echocardiography Probe Movement Guidance
Pattern Recognition (Pattern Recogn.), 2024
Haojun Jiang
Teng Wang
Zhenguo Sun
Yulin Wang
Yang Yue
...
Ning Jia
Meng Li
Shaqi Luo
Shiji Song
Gao Huang
241
1
0
27 Aug 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
European Conference on Computer Vision (ECCV), 2024
Yifan Pu
Zhuofan Xia
Jiayi Guo
Dongchen Han
Qixiu Li
...
Ji Li
Yizeng Han
Shiji Song
Gao Huang
Xiu Li
331
22
0
11 Aug 2024
Fine-grained Dynamic Network for Generic Event Boundary Detection
Ziwei Zheng
Lijun He
Le Yang
Fan Li
195
2
0
05 Jul 2024
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
Le Yang
Ziwei Zheng
Yizeng Han
Hao-Ran Cheng
Shiji Song
Gao Huang
Fan Li
296
21
0
03 Jul 2024
Structure-aware World Model for Probe Guidance via Large-scale Self-supervised Pre-train
Haojun Jiang
Meng Li
Zhenguo Sun
Ning Jia
Yu Sun
Shaqi Luo
Shiji Song
Gao Huang
255
5
0
28 Jun 2024
Rule Based Learning with Dynamic (Graph) Neural Networks
Florian Seiffarth
224
1
0
14 Jun 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
223
2
0
14 May 2024
Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models
Zuyan Liu
Yuhao Dong
Yongming Rao
Jie Zhou
Jiwen Lu
LRM
239
43
0
19 Mar 2024
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Neural Information Processing Systems (NeurIPS), 2024
Wangbo Zhao
Jiasheng Tang
Yizeng Han
Yibing Song
Kai Wang
Gao Huang
F. Wang
Yang You
320
23
0
18 Mar 2024
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang
Yifan Pu
Yizeng Han
Jiayi Guo
Yiru Wang
Xiu Li
Gao Huang
311
17
0
17 Mar 2024
2023 Low-Power Computer Vision Challenge (LPCVC) Summary
Leo Chen
Benjamin Boardley
Ping Hu
Yiru Wang
Yifan Pu
...
Arseny Yanchenko
S. Alyamkin
Xiaowei Hu
George K. Thiruvathukal
Yu Lu
148
2
0
11 Mar 2024
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Qian Wu
Ruoxuan Cui
Yuke Li
Haoqi Zhu
ViT
227
5
0
10 Jan 2024
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar
Yongqin Xian
A. Tonioni
Andrew Zisserman
Federico Tombari
305
23
0
19 Dec 2023
GSVA: Generalized Segmentation via Multimodal Large Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Zhuofan Xia
Dongchen Han
Yizeng Han
Xuran Pan
Shiji Song
Gao Huang
VLM
596
125
0
15 Dec 2023
Rank-DETR for High Quality Object Detection
Neural Information Processing Systems (NeurIPS), 2023
Yifan Pu
Weicong Liang
Yiduo Hao
Yuhui Yuan
Yukang Yang
Chao Zhang
Hanhua Hu
Gao Huang
378
92
0
13 Oct 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
273
23
0
28 Sep 2023
Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Rui Deng
Qian Wu
Yuke Li
Haoran Fu
212
3
0
15 Sep 2023
Fine-grained Recognition with Learnable Semantic Data Augmentation
IEEE Transactions on Image Processing (IEEE TIP), 2023
Yifan Pu
Yizeng Han
Yulin Wang
Junlan Feng
Chao Deng
Gao Huang
276
51
0
01 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
303
32
0
27 Aug 2023
Audio-Visual Glance Network for Efficient Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Muhammad Adi Nugroho
Sangmin Woo
Sumin Lee
Changick Kim
154
8
0
18 Aug 2023
AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition
ACM Multimedia (ACM MM), 2023
Lianyu Hu
Liqing Gao
Zekang Liu
Chi-Man Pun
Wei Feng
SLR
255
32
0
16 Aug 2023
View while Moving: Efficient Video Recognition in Long-untrimmed Videos
ACM Multimedia (ACM MM), 2023
Ye Tian
Meng Yang
Lanshan Zhang
Zhizhen Zhang
Yang Liu
Xiao-Zhu Xie
Xirong Que
Wendong Wang
261
10
0
09 Aug 2023
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
IEEE International Conference on Computer Vision (ICCV), 2023
Shuangrui Ding
Peisen Zhao
Xiaopeng Zhang
Rui Qian
H. Xiong
Qi Tian
ViT
206
26
0
08 Aug 2023
How can objects help action recognition?
Computer Vision and Pattern Recognition (CVPR), 2023
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
226
25
0
20 Jun 2023
Dynamic Perceiver for Efficient Visual Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Yizeng Han
Dongchen Han
Zeyu Liu
Yulin Wang
Xuran Pan
Yifan Pu
Chaorui Deng
Junlan Feng
Qing Xiao
Gao Huang
295
40
0
20 Jun 2023
Few-shot Action Recognition via Intra- and Inter-Video Information Maximization
Huabin Liu
W. Lin
Yun Xu
Yuxi Li
Shuyuan Li
John See
224
9
0
10 May 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
IEEE International Conference on Computer Vision (ICCV), 2023
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
305
25
0
17 Apr 2023
Frame Flexible Network
Computer Vision and Pattern Recognition (CVPR), 2023
Yitian Zhang
Yue Bai
Chang Liu
Huan Wang
Sheng Li
Yun Fu
197
5
0
26 Mar 2023
Adaptive Rotated Convolution for Rotated Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Yifan Pu
Yiru Wang
Zhuofan Xia
Yizeng Han
Yulin Wang
Weihao Gan
Zidong Wang
Qing Xiao
Gao Huang
209
127
0
14 Mar 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Neural Information Processing Systems (NeurIPS), 2023
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
243
34
0
05 Jan 2023
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Junjie Yan
Yingfei Liu
Jian‐Yuan Sun
Fan Jia
Shuailin Li
Tiancai Wang
Xiangyu Zhang
ViT
3DPC
308
110
0
03 Jan 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2022
Wenhao Wu
Xiaohan Wang
Haipeng Luo
Jingdong Wang
Yi Yang
Wanli Ouyang
390
80
0
31 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
IEEE International Conference on Computer Vision (ICCV), 2022
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
239
13
0
08 Dec 2022
Look More but Care Less in Video Recognition
Neural Information Processing Systems (NeurIPS), 2022
Yitian Zhang
Yue Bai
Haiquan Wang
Yi Xu
Yun Fu
216
12
0
18 Nov 2022
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
IEEE International Conference on Computer Vision (ICCV), 2022
Yulin Wang
Yang Yue
Rui Lu
Tian-De Liu
Zhaobai Zhong
Qing Xiao
Gao Huang
307
33
0
17 Nov 2022
Cross-Modal Adapter for Vision-Language Retrieval
Pattern Recognition (Pattern Recogn.), 2022
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Gao Huang
360
43
0
17 Nov 2022
Active Acquisition for Multimodal Temporal Data: A Challenging Decision-Making Task
Jannik Kossen
Cătălina Cangea
Eszter Vértes
Andrew Jaegle
Viorica Patraucean
Ira Ktena
Nenad Tomašev
Danielle Belgrave
278
12
0
09 Nov 2022
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Samrudhdhi B. Rangrej
Kevin J. Liang
Tal Hassner
James J. Clark
268
4
0
24 Oct 2022
AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition
European Conference on Computer Vision (ECCV), 2022
Yulin Wang
Yang Yue
Xin-Wen Xu
Ali Hassani
V. Kulikov
Nikita Orlov
Qing Xiao
Humphrey Shi
Gao Huang
228
20
0
27 Sep 2022