ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2212.01638
  4. Cited By
VLG: General Video Recognition with Web Textual Knowledge

VLG: General Video Recognition with Web Textual Knowledge

3 December 2022
Jintao Lin
Zhaoyang Liu
Wenhai Wang
Wayne Wu
Limin Wang
ArXivPDFHTML

Papers citing "VLG: General Video Recognition with Web Textual Knowledge"

11 / 11 papers shown
Title
Revisiting Classifier: Transferring Vision-Language Models for Video
  Recognition
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
87
93
0
04 Jul 2022
Survey: Transformer based Video-Language Pre-training
Survey: Transformer based Video-Language Pre-training
Ludan Ruan
Qin Jin
VLM
ViT
61
44
0
21 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
360
0
17 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip
  Retrieval
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
303
771
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,939
0
09 Feb 2021
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human
  Action Recognition
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition
Sudhakar Kumawat
Manisha Verma
Yuta Nakashima
S. Raman
126
42
0
22 Jul 2020
Equalization Loss for Long-Tailed Object Recognition
Equalization Loss for Long-Tailed Object Recognition
Jingru Tan
Changbao Wang
Buyu Li
Quanquan Li
Wanli Ouyang
Changqing Yin
Junjie Yan
237
455
0
11 Mar 2020
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed
  Visual Recognition
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
Boyan Zhou
Quan Cui
Xiu-Shen Wei
Zhao-Min Chen
240
765
0
05 Dec 2019
SMOTE: Synthetic Minority Over-sampling Technique
SMOTE: Synthetic Minority Over-sampling Technique
Nitesh V. Chawla
Kevin W. Bowyer
Lawrence Hall
W. Kegelmeyer
AI4TS
160
25,150
0
09 Jun 2011
1