ResearchTrend.AI
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

2 January 2023
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
    SyDa

Papers citing "ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders"

50 / 325 papers shown
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
135
0
0
21 Jan 2025
Progressive Cross Attention Network for Flood Segmentation using Multispectral Satellite Imagery
Vicky Feliren
Fithrothul Khikmah
Irfan Dwiki Bhaswara
Bahrul I. Nasution
Alex M. Lechner
Muhamad Risqi U. Saputra
32
2
0
21 Jan 2025
UNet--: Memory-Efficient and Feature-Enhanced Network Architecture based on U-Net with Reduced Skip-Connections
Lingxiao Yin
Wei Tao
Dongyue Zhao
Tadayuki Ito
Kinya Osa
Masami Kato
Tse-Wei Chen
31
0
0
24 Dec 2024
MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data
V. Saxena
Benjamin Bashpole
Gijs van Dijck
Gerasimos Spanakis
72
0
0
18 Dec 2024
S2S2: Semantic Stacking for Robust Semantic Segmentation in Medical Imaging
Yimu Pan
Sitao Zhang
Alison D. Gernand
Jeffery A. Goldstein
J. Z. Wang
72
1
0
17 Dec 2024
SAMIC: Segment Anything with In-Context Spatial Prompt Engineering
S. Nagendra
Kashif Rashid
Chaopeng Shen
Daniel Kifer
VLM
71
2
0
16 Dec 2024
Multilabel Classification for Lung Disease Detection: Integrating Deep Learning and Natural Language Processing
Maria Efimovich
Jayden Lim
Vedant Mehta
Ethan Poon
71
0
0
16 Dec 2024
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes
Ruijie Lu
Yixin Chen
Junfeng Ni
Baoxiong Jia
Yu Liu
Diwen Wan
Gang Zeng
Siyuan Huang
DiffM
127
4
0
16 Dec 2024
ASANet: Asymmetric Semantic Aligning Network for RGB and SAR image land cover classification
Pan Zhang
Baochai Peng
Chaoran Lu
Quanjin Huang
71
1
0
03 Dec 2024
I Spy With My Little Eye: A Minimum Cost Multicut Investigation of Dataset Frames
Katharina Prasse
Isaac Bravo
Stefanie Walter
M. Keuper
67
1
0
02 Dec 2024
Token Cropr: Faster ViTs for Quite a Few Tasks
Benjamin Bergner
C. Lippert
Aravindh Mahendran
ViT
VLM
69
0
0
01 Dec 2024
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker
Anton Smirnov
Jordi Pons
CJ Carr
Zack Zukowski
Zach Evans
Xubo Liu
75
9
0
29 Nov 2024
Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training
Man Yao
Xuerui Qiu
Tianxiang Hu
J. Hu
Yuhong Chou
Keyu Tian
Jianxing Liao
Luziwei Leng
Bo Xu
Guoqi Li
74
4
0
25 Nov 2024
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
Yongkun Du
Z. Chen
Hongtao Xie
Caiyan Jia
Yu Jiang
83
1
0
24 Nov 2024
Multi-Token Enhancing for Vision Representation Learning
Zhong-Yu Li
Yu-Song Hu
Bo Yin
Ming-Ming Cheng
66
1
0
24 Nov 2024
PR-MIM: Delving Deeper into Partial Reconstruction in Masked Image Modeling
Zhong-Yu Li
Yunheng Li
Deng-Ping Fan
Ming-Ming Cheng
66
0
0
24 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
Xiao-Hang Jiang
Hui-Peng Du
Yang Ai
Ye-Xin Lu
Zhen-Hua Ling
28
0
0
18 Nov 2024
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
Yu-Fei Shi
Yang Ai
Ye-Xin Lu
Hui-Peng Du
Zhen-Hua Ling
31
0
0
18 Nov 2024
Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition
T. Lin
Jinglei Zhang
Yi Xu
Kai Chen
Rui Zhang
C. L. P. Chen
38
0
0
18 Nov 2024
Emotional Images: Assessing Emotions in Images and Potential Biases in Generative Models
Maneet Mehta
Cody Buntain
EGVM
32
2
0
08 Nov 2024
GazeGen: Gaze-Driven User Interaction for Visual Content Generation
He-Yen Hsieh
Ziyun Li
Sai Qian Zhang
W. Ting
Kao-Den Chang
B. D. Salvo
Chiao Liu
H. T. Kung
VGen
32
0
0
07 Nov 2024
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios
Xiao-Hang Jiang
Yang Ai
Rui Zheng
Hui-Peng Du
Ye-Xin Lu
Zhen-Hua Ling
48
2
0
01 Nov 2024
MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation
Yufeng Jiang
Zongxi Li
Xiangyan Chen
Haoran Xie
Jing Cai
Mamba
37
1
0
31 Oct 2024
Decoupling Semantic Similarity from Spatial Alignment for Neural Networks
Tassilo Wald
Constantin Ulrich
Gregor Köhler
David Zimmerer
Stefan Denner
Michael Baumgartner
Fabian Isensee
Priyank Jaini
Klaus H. Maier-Hein
38
0
0
30 Oct 2024
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm
Hui-Peng Du
Yang Ai
Rui Zheng
Zhen-Hua Ling
35
0
0
30 Oct 2024
Revisiting MAE pre-training for 3D medical image segmentation
Tassilo Wald
Constantin Ulrich
Stanislav Lukyanenko
Andrei Goncharov
Alberto Paderno
Leander Maerkisch
Paul F. Jäger
Klaus Maier-Hein
42
2
0
30 Oct 2024
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets
Adrian Iordache
B. Alexe
Radu Tudor Ionescu
31
1
0
29 Oct 2024
Going Beyond H&E and Oncology: How Do Histopathology Foundation Models Perform for Multi-stain IHC and Immunology?
Amaya Gallagher-Syed
Elena Pontarini
M. Lewis
Michael Barnes
Gregory Slabaugh
23
1
0
28 Oct 2024
Decoding Reading Goals from Eye Movements
Omer Shubi
Cfir Avraham Hadar
Yevgeni Berzak
AIMat
44
1
0
28 Oct 2024
Classifying Bicycle Infrastructure Using On-Bike Street-Level Images
Kal Backman
Ben Beck
Dana Kulić
19
0
0
24 Oct 2024
Scale Propagation Network for Generalizable Depth Completion
Haotian Wang
Meng Yang
Xinhu Zheng
Gang Hua
29
2
0
24 Oct 2024
Frontiers in Intelligent Colonoscopy
Ge-Peng Ji
Jingyi Liu
Peng-Tao Xu
Nick Barnes
F. Khan
Salman Khan
Deng-Ping Fan
41
4
0
22 Oct 2024
When Graph meets Multimodal: Benchmarking and Meditating on Multimodal Attributed Graphs Learning
Hao Yan
C. Li
Zhigang Yu
Jun Yin
Ruochen Liu
Peiyan Zhang
Weihao Han
Mingzheng Li
Zhengxin Zeng
24
0
0
11 Oct 2024
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
25
52
0
09 Oct 2024
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
Siyuan Li
Juanxi Tian
Zedong Wang
Luyuan Zhang
Zicheng Liu
Weiyang Jin
Yang Liu
Baigui Sun
Stan Z. Li
29
0
0
08 Oct 2024
Stage-Wise and Prior-Aware Neural Speech Phase Prediction
Fei Liu
Yang Ai
Hui-Peng Du
Ye-Xin Lu
Rui Zheng
Zhen-Hua Ling
30
0
0
07 Oct 2024
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
Xunjian Yin
Xinyi Wang
Liangming Pan
Xiaojun Wan
William Yang Wang
LM&Ro
AIFin
AI4CE
LLMAG
22
5
0
06 Oct 2024
AIM 2024 Challenge on Video Super-Resolution Quality Assessment: Methods and Results
Ivan Molodetskikh
Artem Borisov
D. Vatolin
Radu Timofte
Jianzhao Liu
...
Xiongkuo Min
Guangtao Zhai
Weihua Luo
Yupeng Z.
Hong Y
SupR
48
6
0
05 Oct 2024
Designing Concise ConvNets with Columnar Stages
Ashish Kumar
Jaesik Park
MQ
23
0
0
05 Oct 2024
Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time
Chiao-An Yang
Ziwei Liu
Raymond A. Yeh
23
1
0
01 Oct 2024
FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation
Thomas Pöllabauer
Ashwin Pramod
Volker Knauthe
Michael Wahl
21
1
0
18 Sep 2024
DACAT: Dual-stream Adaptive Clip-aware Time Modeling for Robust Online Surgical Phase Recognition
Kaixiang Yang
Qiang Li
Zhiwei Wang
29
2
0
10 Sep 2024
VFA: Vision Frequency Analysis of Foundation Models and Human
Mohammad Javad Darvishi Bayazi
Md Rifat Arefin
Jocelyn Faubert
Irina Rish
VLM
37
1
0
09 Sep 2024
Improving Robustness of Spectrogram Classifiers with Neural Stochastic Differential Equations
Joel Brogan
Olivera Kotevska
Anibely Torres
S. Jha
Mark Adams
15
0
0
03 Sep 2024
VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization
Yixuan Zhou
Xing Xu
Zhe Sun
Jingkuan Song
A. Cichocki
Heng Tao Shen
53
1
0
02 Sep 2024
Activation function optimization method: Learnable series linear units (LSLUs)
Chuan Feng
Xi Lin
Shiping Zhu
Hongkang Shi
Maojie Tang
Hua Huang
22
0
0
28 Aug 2024
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi
Fuxiao Liu
Shihao Wang
Shijia Liao
Subhashree Radhakrishnan
...
Andrew Tao
Zhiding Yu
Guilin Liu
MLLM
25
53
0
28 Aug 2024
GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets
Sven Oehri
Nikolas Ebert
Ahmed Abdullah
Didier Stricker
Oliver Wasenmüller
ViT
26
5
0
26 Aug 2024
VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models
Wentao Wu
Fanghua Hong
Xiao Wang
Chenglong Li
Jin Tang
VLM
54
1
0
23 Aug 2024