ResearchTrend.AI
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

2 January 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
    SyDa

Papers citing "ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders"

50 / 325 papers shown
SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting
Hoon Kim
Minje Jang
Wonjun Yoon
Jisoo Lee
Donghyun Na
Sanghyun Woo
AI4CE
36
19
0
29 Feb 2024
OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction
Jian Liu
Sipeng Zhang
Chuixin Kong
Wenyuan Zhang
Yuhang Wu
Yikang Ding
Borun Xu
Ruibo Ming
Dong-Lai Wei
Xianming Liu
28
7
0
28 Feb 2024
Divide-Conquer-and-Merge: Memory- and Time-Efficient Holographic Displays
Zhenxing Dong
Jidong Jia
Yan Li
Yuye Ling
23
0
0
25 Feb 2024
Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition
Mingkun Yang
Biao Yang
Minghui Liao
Yingying Zhu
Xiang Bai
32
5
0
24 Feb 2024
Zero-shot generalization across architectures for visual classification
Evan Gerritz
Luciano Dyballa
Steven W. Zucker
29
1
0
21 Feb 2024
Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition
Mingkun Yang
Biao Yang
Minghui Liao
Yingying Zhu
X. Bai
VLM
72
10
0
21 Feb 2024
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Chien-Yao Wang
I-Hau Yeh
Hong-Yuan Mark Liao
46
1,148
0
21 Feb 2024
YOLO-Ant: A Lightweight Detector via Depthwise Separable Convolutional and Large Kernel Design for Antenna Interference Source Detection
Xiaoyu Tang
Xingming Chen
Jintao Cheng
Jin Wu
Rui Fan
Chengxi Zhang
Zebo Zhou
21
4
0
20 Feb 2024
Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators
Benedikt Alkin
Andreas Fürst
Simon Schmid
Lukas Gruber
Markus Holzleitner
Johannes Brandstetter
PINN
AI4CE
42
8
0
19 Feb 2024
APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
Yang Ai
Xiao-Hang Jiang
Ye-Xin Lu
Hui-Peng Du
Zhenhua Ling
21
20
0
16 Feb 2024
Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection
Colin Decourt
R. V. Rullen
D. Salle
Thomas Oberlin
SSL
33
0
0
13 Feb 2024
Neural Networks Learn Statistics of Increasing Complexity
Nora Belrose
Quintin Pope
Lucia Quirke
Alex Troy Mallen
Xiaoli Z. Fern
16
11
0
06 Feb 2024
EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong
Shikun Liu
Xiaoyang Lyu
Marwan Taher
Xiaojuan Qi
Andrew J. Davison
DiffM
78
42
0
06 Feb 2024
MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers
Yatong Bai
Mo Zhou
Vishal M. Patel
Somayeh Sojoudi
AAML
19
6
0
03 Feb 2024
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data
Wei-Yao Wang
Wei-Wei Du
Derek Xu
Wei Wang
Wenjie Peng
LMTD
30
7
0
02 Feb 2024
MouSi: Poly-Visual-Expert Vision-Language Models
Xiaoran Fan
Tao Ji
Changhao Jiang
Shuo Li
Senjie Jin
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yunchun Jiang
VLM
24
16
0
30 Jan 2024
Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels
Chak Fong Chong
Xinyi Fang
Jielong Guo
Yapeng Wang
Wei Ke
C. Lam
Sio-Kei Im
23
1
0
30 Jan 2024
Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels
Z. Babaiee
Peyman M. Kiasari
Daniela Rus
Radu Grosu
33
1
0
25 Jan 2024
WAL-Net: Weakly supervised auxiliary task learning network for carotid plaques classification
Haitao Gan
Lingchao Fu
Ran Zhou
Weiyan Gan
Furong Wang
Xiaoyan Wu
Zhi Yang
Zhongwei Huang
26
2
0
25 Jan 2024
Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields
Z. Babaiee
Peyman M. Kiasari
Daniela Rus
Radu Grosu
MDE
11
3
0
18 Jan 2024
UPDP: A Unified Progressive Depth Pruner for CNN and Vision Transformer
Ji Liu
Dehua Tang
Yuanxian Huang
Li Lyna Zhang
Xiaocheng Zeng
...
Jinzhang Peng
Yu-Chiang Frank Wang
Fan Jiang
Lu Tian
Ashish Sirasao
ViT
24
7
0
12 Jan 2024
Do Vision and Language Encoders Represent the World Similarly?
Mayug Maniparambil
Raiymbek Akshulakov
Y. A. D. Djilali
Sanath Narayan
M. Seddik
K. Mangalam
Noel E. O'Connor
VLM
21
11
0
10 Jan 2024
Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models
Dingning Liu
Xiaoshui Huang
Yuenan Hou
Zhihui Wang
Zhen-fei Yin
Yongshun Gong
Peng Gao
Wanli Ouyang
19
8
0
09 Jan 2024
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning
Fanqing Meng
Wenqi Shao
Quanfeng Lu
Peng Gao
Kaipeng Zhang
Yu Qiao
Ping Luo
27
45
0
04 Jan 2024
Explore Human Parsing Modality for Action Recognition
Jinfu Liu
Runwei Ding
Yuhang Wen
Nan Dai
Fanyang Meng
Shen Zhao
Mengyuan Liu
25
7
0
04 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
33
14
0
31 Dec 2023
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
23
9
0
14 Dec 2023
Learned representation-guided diffusion models for large-image generation
Alexandros Graikos
Srikar Yellapragada
Minh-Quan Le
S. Kapse
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
27
26
0
12 Dec 2023
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu
Shujian Yu
Jingzheng Wu
S. Picek
AAML
35
0
0
08 Dec 2023
Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation
Genki Kinoshita
Ko Nishino
MDE
21
1
0
07 Dec 2023
Foveation in the Era of Deep Learning
George Killick
Paul Henderson
Paul Siebert
Gerardo Aragon Camarasa
FedML
35
1
0
03 Dec 2023
SparseDC: Depth Completion from sparse and non-uniform inputs
Chen Long
Wenxiao Zhang
Zhe Chen
Haiping Wang
Yuan-Bin Liu
Zhen Cao
Zhen Dong
Bisheng Yang
MDE
30
8
0
30 Nov 2023
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
Ju He
Qihang Yu
Inkyu Shin
XueQing Deng
Alan L. Yuille
Xiaohui Shen
Liang-Chieh Chen
VOS
30
2
0
30 Nov 2023
DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations
Maximilian Augustin
Yannic Neuhaus
Matthias Hein
DiffM
29
4
0
29 Nov 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM
AI4TS
SSL
21
100
0
27 Nov 2023
PhytNet -- Tailored Convolutional Neural Networks for Custom Botanical Data
Jamie R. Sykes
Katherine Denby
Daniel W. Franks
17
1
0
20 Nov 2023
Pair-wise Layer Attention with Spatial Masking for Video Prediction
Ping Li
Chenhan Zhang
Zheng Yang
Xianghua Xu
Mingli Song
19
0
0
19 Nov 2023
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Kirill Vishniakov
Zhiqiang Shen
Zhuang Liu
CLIP
40
16
0
15 Nov 2023
MD-IQA: Learning Multi-scale Distributed Image Quality Assessment with Semi Supervised Learning for Low Dose CT
Tao Song
Ruizhi Hou
Lisong Dai
Lei Xiang
18
4
0
14 Nov 2023
SynthEnsemble: A Fusion of CNN, Vision Transformer, and Hybrid Models for Multi-Label Chest X-Ray Classification
S. M. N. Ashraf
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Rabiul Alam
ViT
MedIm
8
7
0
13 Nov 2023
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
Ziyi Lin
Chris Liu
Renrui Zhang
Peng Gao
Longtian Qiu
...
Siyuan Huang
Yichi Zhang
Xuming He
Hongsheng Li
Yu Qiao
MLLM
VLM
33
208
0
13 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
36
143
0
10 Nov 2023
Learning Discriminative Features for Crowd Counting
Yuehai Chen
Qingzhong Wang
Jing Yang
Badong Chen
Haoyi Xiong
Shaoyi Du
25
6
0
08 Nov 2023
FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data
Lisa Weijler
Florian Kowarsch
Michael Reiter
Pedro Hermosilla
Margarita Maurer-Granofszky
Michael N. Dworzak
MedIm
11
2
0
06 Nov 2023
Multi-view learning for automatic classification of multi-wavelength auroral images
Qiuju Yang
Hang Su
Lili Liu
Yixuan Wang
Ze-Jun Hu
16
2
0
06 Nov 2023
Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly
Qizhang Li
Yiwen Guo
Wangmeng Zuo
Hao Chen
ELM
AAML
33
2
0
02 Nov 2023
H-NeXt: The next step towards roto-translation invariant networks
Tomáš Karella
F. Šroubek
J. Flusser
Jan Blazek
Vasek Kosik
29
1
0
02 Nov 2023
CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders
A. Fuller
K. Millard
James R. Green
17
60
0
01 Nov 2023
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng
J. Zico Kolter
VLM
52
12
0
28 Oct 2023
Towards Learning Monocular 3D Object Localization From 2D Labels using the Physical Laws of Motion
Daniel Kienzle
Julian Lorenz
K. Ludwig
Rainer Lienhart
28
1
0
26 Oct 2023