ResearchTrend.AI
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

2 January 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
    SyDa
arXiv: 2301.00808

Papers citing "ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders"

50 / 325 papers shown
Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
Chongjie Si
Xuehui Wang
Xue Yang
Zhengqin Xu
Qingyun Li
Jifeng Dai
Yu Qiao
Xiaokang Yang
Wei Shen
31
8
0
23 May 2024
Infinite-Dimensional Feature Interaction
Chenhui Xu
Fuxun Yu
Maoliang Li
Zihao Zheng
Zirui Xu
Jinjun Xiong
Xiang Chen
34
1
0
22 May 2024
Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation
Dingwen Zhang
Hao Li
Diqi He
Nian Liu
Lechao Cheng
Jingdong Wang
Junwei Han
VLM
43
0
0
22 May 2024
MiniMaxAD: A Lightweight Autoencoder for Feature-Rich Anomaly Detection
Feng Wang
Chengming Liu
Lei Shi
Haibo Pang
34
1
0
16 May 2024
PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features
Xusheng Li
Chengliang Wang
Shumao Wang
Zhuo Zeng
Ji Liu
3DPC
32
0
0
16 May 2024
FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival
Liangrui Pan
Yijun Peng
Yan Li
Yiyi Liang
Liwen Xu
Qingchun Liang
Shaoliang Peng
34
0
0
13 May 2024
Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare
Xingyu Li
Lu Peng
Yuping Wang
Weihua Zhang
AI4CE
MedIm
LM&MA
66
5
0
10 May 2024
MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
Vishal Nedungadi
A. Kariryaa
Stefan Oehmcke
Serge J. Belongie
Christian Igel
Nico Lang
37
25
0
04 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
26
3
0
02 May 2024
MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection
H. R. Medeiros
David Latortue
Fidel Alejandro Guerrero Peña
Eric Granger
M. Pedersoli
19
0
0
29 Apr 2024
NTIRE 2024 Quality Assessment of AI-Generated Content Challenge
Xiaohong Liu
Xiongkuo Min
Guangtao Zhai
Chunyi Li
Tengchuan Kou
...
Qi Yan
Youran Qu
Xiaohui Zeng
Lele Wang
Renjie Liao
50
29
0
25 Apr 2024
CKGConv: General Graph Convolution with Continuous Kernels
Liheng Ma
Soumyasundar Pal
Yitian Zhang
Jiaming Zhou
Yingxue Zhang
Mark J. Coates
37
3
0
21 Apr 2024
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation
Gensheng Pei
Yazhou Yao
Jianbo Jiao
Wenguan Wang
Liqiang Nie
Jinhui Tang
VOS
32
1
0
21 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
39
1
0
18 Apr 2024
NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results
Xin Li
Kun Yuan
Yajing Pei
Yiting Lu
Ming-hui Sun
...
Kele Xu
Qisheng Xu
Tao Sun
Zhi-Guo Ding
Yuhan Hu
46
23
0
17 Apr 2024
Contextrast: Contextual Contrastive Learning for Semantic Segmentation
Chan-Yong Sung
Wanhee Kim
Jungho An
Wooju Lee
Hyungtae Lim
Hyun Myung
39
12
0
16 Apr 2024
XoFTR: Cross-modal Feature Matching Transformer
Önder Tuzcuoglu
Aybora Köksal
Bugra Sofu
Sinan Kalkan
Aydin Alatan
ViT
45
10
0
15 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
35
79
0
12 Apr 2024
Masked Image Modeling as a Framework for Self-Supervised Learning across Eye Movements
Robin Weiler
Matthias Brucklacher
C. Pennartz
Sander M. Bohté
31
0
0
12 Apr 2024
Adapting LLaMA Decoder to Vision Transformer
Jiahao Wang
Wenqi Shao
Mengzhao Chen
Chengyue Wu
Yong Liu
Taiqiang Wu
Kaipeng Zhang
Songyang Zhang
Kai-xiang Chen
Ping Luo
MLLM
38
4
0
10 Apr 2024
Pneumonia App: a mobile application for efficient pediatric pneumonia diagnosis using explainable convolutional neural networks (CNN)
Jiaming Deng
Zhenglin Chen
Minjiang Chen
Lulu Xu
Jiaqi Yang
Zhendong Luo
Peiwu Qin
46
2
0
31 Mar 2024
Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment
Alireza Ganjdanesh
Shangqian Gao
Heng-Chiao Huang
36
5
0
28 Mar 2024
SGDM: Static-Guided Dynamic Module Make Stronger Visual Models
Wenjie Xing
Zhenchao Cui
Jing Qi
32
0
0
27 Mar 2024
QuakeSet: A Dataset and Low-Resource Models to Monitor Earthquakes through Sentinel-1
Daniele Rege Cambrin
Paolo Garza
22
6
0
26 Mar 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
29
2
0
26 Mar 2024
Deep Learning for Segmentation of Cracks in High-Resolution Images of Steel Bridges
Andrii Kompanets
Gautam Pai
R. Duits
Davide Leonetti
Bert Snijder
33
1
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
26
86
0
26 Mar 2024
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud
Burhaneddin Yaman
Chun-Hao Liu
Diana Marculescu
38
2
0
24 Mar 2024
Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection
Minzhou Pan
Zhengting Wang
Xin Dong
Vikash Sehwag
Lingjuan Lyu
Xue Lin
38
3
0
23 Mar 2024
ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding
Novendra Setyawan
Ghufron Wahyu Kurniawan
Chi-Chia Sun
Jun-Wei Hsieh
Hui-Kai Su
W. Kuo
ViT
MoE
29
0
0
22 Mar 2024
Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation
Wenlve Zhou
Zhiheng Zhou
Tianlei Wang
Delu Zeng
35
0
0
22 Mar 2024
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
Bohao Peng
Xiaoyang Wu
Li Jiang
Yukang Chen
Hengshuang Zhao
Zhuotao Tian
Jiaya Jia
48
17
0
21 Mar 2024
Unifying Local and Global Multimodal Features for Place Recognition in Aliased and Low-Texture Environments
Alberto García-Hernández
Riccardo Giubilato
Klaus H. Strobl
Javier Civera
Rudolph Triebel
37
2
0
20 Mar 2024
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang
Irving Fang
Juexiao Zhang
Hao Wu
Akshat Kaushik
Alice Rodriguez
Hanwen Zhao
Zhuo Zheng
Radu Iovita
Chen Feng
22
3
0
19 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
LSKNet: A Foundation Lightweight Backbone for Remote Sensing
Yuxuan Li
Xiang Li
Yimian Dai
Qibin Hou
Li Liu
Yongxiang Liu
Ming-Ming Cheng
Jian Yang
34
31
0
18 Mar 2024
PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks
Philip Matthias Winter
M. Wimmer
David Major
Dimitrios Lenis
Astrid Berg
Theresa Neubauer
Gaia Romana De Paolis
Johannes Novotny
Sophia Ulonska
Katja Bühler
34
0
0
18 Mar 2024
NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications
Vladimir Korviakov
Denis Koposov
31
0
0
17 Mar 2024
EfficientMorph: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration
Abu Zahid Bin Aziz
Mokshagna Sai Teja Karanam
Tushar Kataria
Shireen Elhabian
ViT
MedIm
29
1
0
16 Mar 2024
D-Net: Dynamic Large Kernel with Dynamic Feature Fusion for Volumetric Medical Image Segmentation
Jin Yang
Peijie Qiu
Yichi Zhang
Daniel S. Marcus
Aristeidis Sotiras
MedIm
36
9
0
15 Mar 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Yu Liu
Wenlin Zhang
Shaochu Wang
Fangyu Zuo
Peiguang Jing
Yong Ji
27
0
0
15 Mar 2024
SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival
Liangrui Pan
Yijun Peng
Yan Li
Xiang Wang
Wenjuan Liu
Liwen Xu
Qingchun Liang
Shaoliang Peng
35
3
0
14 Mar 2024
Not just Birds and Cars: Generic, Scalable and Explainable Models for Professional Visual Recognition
Junde Wu
Jiayuan Zhu
Min Xu
Yueming Jin
30
0
0
08 Mar 2024
LVIC: Multi-modality segmentation by Lifting Visual Info as Cue
Zichao Dong
Bowen Pang
Xufeng Huang
Hang Ji
Xin Zhan
Junbo Chen
3DPC
35
0
0
08 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
24
2
0
08 Mar 2024
Select High-Level Features: Efficient Experts from a Hierarchical Classification Network
A. Kelm
Niels Hannemann
Bruno Heberle
Lucas Schmidt
Tim Rolff
Christian Wilms
Ehsan Yaghoubi
Simone Frintrop
21
0
0
08 Mar 2024
RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features
Geonho Bang
Kwangjin Choi
Jisong Kim
Dongsuk Kum
Jun Won Choi
38
13
0
08 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
19
3
0
07 Mar 2024
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
29
1
0
29 Feb 2024
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei
Tao Chen
XiRuo Jiang
Huafeng Liu
Zeren Sun
Yazhou Yao
VGen
36
9
0
29 Feb 2024