EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

Computer Vision and Pattern Recognition (CVPR), 2022

14 November 2022

Papers citing "EVA: Exploring the Limits of Masked Visual Representation Learning at Scale"

50 / 579 papers shown

Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers

International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2023

241

09 Oct 2023

Low-Resolution Self-Attention for Semantic Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

464

08 Oct 2023

Enhancing Representations through Heterogeneous Self-Supervised Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

359

08 Oct 2023

Improved Baselines with Visual Instruction Tuning

Computer Vision and Pattern Recognition (CVPR), 2023

606

4,171

05 Oct 2023

Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

355

05 Oct 2023

Text-image Alignment for Diffusion-based Perception

Computer Vision and Pattern Recognition (CVPR), 2023

495

29 Sep 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

...

Conghui He

Yu Qiao

790

307

26 Sep 2023

MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection

510

26 Sep 2023

Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding

Fan Wang

149

15 Sep 2023

MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning

International Conference on Learning Representations (ICLR), 2023

Zefan Cai

Xiaojian Ma

448

184

14 Sep 2023

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

215

12 Sep 2023

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

International Conference on Learning Representations (ICLR), 2023

Kun Xu

...

233

09 Sep 2023

Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

220

06 Sep 2023

Image Aesthetics Assessment via Learnable Queries

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

197

06 Sep 2023

NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

...

230

05 Sep 2023

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

Gao Huang

248

04 Sep 2023

RevColV2: Exploring Disentangled Representations in Masked Image Modeling

Neural Information Processing Systems (NeurIPS), 2023

Qi Han

Yuxuan Cai

Xiangyu Zhang

303

02 Sep 2023

Contrastive Feature Masking Open-Vocabulary Vision Transformer

IEEE International Conference on Computer Vision (ICCV), 2023

326

02 Sep 2023

Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models

182

31 Aug 2023

A General-Purpose Self-Supervised Model for Computational Pathology

Richard J. Chen

Tong Ding

Ming Y. Lu

Drew F. K. Williamson

...

327

29 Aug 2023

VIGC: Visual Instruction Generation and Correction

AAAI Conference on Artificial Intelligence (AAAI), 2023

Huaping Zhong

...

Conghui He

328

24 Aug 2023

Spatial Transform Decoupling for Oriented Object Detection

AAAI Conference on Artificial Intelligence (AAAI), 2023

Hongtian Yu

Yunjie Tian

QiXiang Ye

Yunfan Liu

249

21 Aug 2023

ViT-Lens: Initiating Omni-Modal Exploration through 3D Insights

Ying Shan

167

20 Aug 2023

A Unified Interactive Model Evaluation for Classification, Object Detection, and Instance Segmentation in Computer Vision

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

Jing Wu

Hang Su

Hanspeter Pfister

Shixia Liu

227

09 Aug 2023

High-Level Parallelism and Nested Features for Dynamic Inference Cost and Top-Down Attention

194

09 Aug 2023

Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions

International Conference on Learning Representations (ICLR), 2023

Wei Ji

312

08 Aug 2023

Tiny LVLM-eHub: Early Multimodal Experiments with Bard

IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023

...

Ping Luo

207

07 Aug 2023

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

International Conference on Machine Learning (ICML), 2023

Weihao Yu

Zicheng Liu

541

1,029

04 Aug 2023

A Parameter-efficient Multi-subject Model for Predicting fMRI Activity

Connor Lane

Gregory Kiar

167

04 Aug 2023

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

International Conference on Learning Representations (ICLR), 2023

...

Zhiguo Cao

Yu Qiao

270

118

03 Aug 2023

DETR Doesn't Need Multi-Scale or Locality Design

267

03 Aug 2023

RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension

Fan Wang

169

03 Aug 2023

Guided Distillation for Semi-Supervised Instance Segmentation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

206

03 Aug 2023

Improving Pixel-based MIM by Reducing Wasted Modeling Capability

IEEE International Conference on Computer Vision (ICCV), 2023

208

01 Aug 2023

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Computer Vision and Pattern Recognition (CVPR), 2023

...

620

453

31 Jul 2023

CLIP Brings Better Features to Visual Aesthetics Learners

212

28 Jul 2023

Human-centric Scene Understanding for 3D Large-scale Scenarios

IEEE International Conference on Computer Vision (ICCV), 2023

Xinge Zhu

Jingyi Yu

Yuexin Ma

3DV

185

26 Jul 2023

Foundational Models Defining a New Era in Vision: A Survey and Outlook

Muhammad Awais

Muzammal Naseer

Salman Khan

Rao Muhammad Anwer

Hisham Cholakkal

430

152

25 Jul 2023

CLIP-KD: An Empirical Study of CLIP Model Distillation

Computer Vision and Pattern Recognition (CVPR), 2023

345

24 Jul 2023

COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts

IEEE International Conference on Computer Vision (ICCV), 2023

Hang Su

235

24 Jul 2023

GEM: Boost Simple Network for Glass Surface Segmentation via Vision Foundation Models

IEEE Transactions on Multimedia (IEEE TMM), 2023

273

22 Jul 2023

CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots

IEEE International Conference on Robotics and Automation (ICRA), 2023

283

21 Jul 2023

Watch out Venomous Snake Species: A Solution to SnakeCLEF2023

Conference and Labs of the Evaluation Forum (CLEF), 2023

214

19 Jul 2023

MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results

...

194

18 Jul 2023

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

Neural Information Processing Systems (NeurIPS), 2023

388

13 Jul 2023

Self-regulating Prompts: Foundational Model Adaptation without Forgetting

IEEE International Conference on Computer Vision (ICCV), 2023

Muhammad Uzair Khattak

Salman Khan

386

308

13 Jul 2023

mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs

Radu Timofte

228

13 Jul 2023

What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

317

05 Jul 2023

Surgical fine-tuning for Grape Bunch Segmentation under Visual Domain Shifts

European Conference on Mobile Robots (ECMR), 2023

150

03 Jul 2023

Stitched ViTs are Flexible Vision Backbones

European Conference on Computer Vision (ECCV), 2023

Zizheng Pan

Jing Liu

Haoyu He

Jianfei Cai

Bohan Zhuang

187

30 Jun 2023