2201.03545
Cited By
A ConvNet for the 2020s
10 January 2022
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
Papers citing "A ConvNet for the 2020s"
50 / 2,184 papers shown
Title
Novel Pooling-based VGG-Lite for Pneumonia and Covid-19 Detection from Imbalanced Chest X-Ray Datasets
Santanu Roy
Ashvath Suresh
Palak Sahu
Tulika Rudra Gupta
24
0
0
10 Apr 2025
FANeRV: Frequency Separation and Augmentation based Neural Representation for Video
Li Yu
Zhihui Li
Chao Yao
Jimin Xiao
M. Gabbouj
30
0
0
09 Apr 2025
On the Importance of Conditioning for Privacy-Preserving Data Augmentation
Julian Lorenz
K. Ludwig
Valentin Haug
Rainer Lienhart
DiffM
36
0
0
08 Apr 2025
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
Rupayan Mallick
Sibo Dong
Nataniel Ruiz
Sarah Adel Bargal
DiffM
42
0
0
08 Apr 2025
DefMamba: Deformable Visual State Space Model
Leiye Liu
Miao Zhang
Jihao Yin
Tingwei Liu
Wei Ji
Yongri Piao
Huchuan Lu
Mamba
55
0
0
08 Apr 2025
Contour Integration Underlies Human-Like Vision
Ben Lonnqvist
Elsa Scialom
Abdülkadir Gökce
Zehra Merchant
Michael H. Herzog
Martin Schrimpf
VLM
28
0
0
07 Apr 2025
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation
Bo Yin
Jiao-Long Cao
Ming-Ming Cheng
Qibin Hou
3DPC
MDE
48
0
0
07 Apr 2025
Content-Aware Transformer for All-in-one Image Restoration
Gang Wu
Junjun Jiang
Kui Jiang
Xianming Liu
ViT
50
0
0
07 Apr 2025
One Quantizer is Enough: Toward a Lightweight Audio Codec
Linwei Zhai
H. Ding
Cui Zhao
Fei-Yue Wang
Ge Wang
Wang Zhi
Wei Xi
MQ
27
0
0
07 Apr 2025
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang
Mozhgan Nasr Azadani
Sean Sedwards
Krzysztof Czarnecki
MLLM
MoE
52
0
0
07 Apr 2025
EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection
Muhammad Ahmed Ullah Khan
Abdul Hannan Khan
Andreas Dengel
33
0
0
05 Apr 2025
Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis
Jiabo Shi
Yehia Elkhatib
3DH
VLM
25
0
0
04 Apr 2025
Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion
Junkai Zhang
Bin Li
Shoujun Zhou
Yue Du
36
0
0
04 Apr 2025
GMR-Conv: An Efficient Rotation and Reflection Equivariant Convolution Kernel Using Gaussian Mixture Rings
Yuexi Du
Jiazhen Zhang
Nicha Dvornek
J. Onofrey
AAML
40
0
0
03 Apr 2025
Noise-Aware Generalization: Robustness to In-Domain Noise and Out-of-Domain Generalization
Siqi Wang
Aoming Liu
Bryan A. Plummer
OOD
36
0
0
03 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
59
0
0
03 Apr 2025
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
Junjie Wu
Jiangtao Xie
Zhaolin Zhang
Qilong Wang
Q. Hu
P. Li
Sen Xu
VLM
36
0
0
02 Apr 2025
All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning
Zheng Yang
Ruoxin Chen
Zhiyuan Yan
Ke-Yue Zhang
Xinghe Fu
...
Xiujun Shu
Taiping Yao
Junchi Yan
Shouhong Ding
Xi Li
29
0
0
02 Apr 2025
Slow-Fast Architecture for Video Multi-Modal Large Language Models
Min Shi
Shihao Wang
Chieh-Yun Chen
Jitesh Jain
Kai Wang
Junjun Xiong
Guilin Liu
Zhiding Yu
Humphrey Shi
31
1
0
02 Apr 2025
Scaling Language-Free Visual Representation Learning
David Fan
Shengbang Tong
Jiachen Zhu
Koustuv Sinha
Zhuang Liu
...
Michael G. Rabbat
Nicolas Ballas
Yann LeCun
Amir Bar
Saining Xie
CLIP
VLM
56
2
0
01 Apr 2025
Spingarn's Method and Progressive Decoupling Beyond Elicitable Monotonicity
B. Evens
P. Latafat
Panagiotis Patrinos
46
0
0
01 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
32
0
0
31 Mar 2025
Consistency-aware Self-Training for Iterative-based Stereo Matching
Jingyi Zhou
Peng Ye
H. Zhang
Jiakang Yuan
Rao Qiang
Liu YangChenXu
Wu Cailin
Feng Xu
Tao Chen
3DV
44
0
0
31 Mar 2025
Video-based Traffic Light Recognition by Rockchip RV1126 for Autonomous Driving
Miao Fan
Xuxu Kong
Shengtong Xu
Haoyi Xiong
Xiangzeng Liu
ViT
36
0
0
31 Mar 2025
KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters
Haiduo Huang
Yadong Zhang
Pengju Ren
47
0
0
30 Mar 2025
LSNet: See Large, Focus Small
Ao Wang
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
37
0
0
29 Mar 2025
SupertonicTTS: Towards Highly Scalable and Efficient Text-to-Speech System
H. Kim
Jinhyeok Yang
Yechan Yu
Seunghun Ji
Jacob Morton
Frederik Bous
Joon Byun
Juheon Lee
46
0
0
29 Mar 2025
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Chenkai Zhang
Yiming Lei
Zeming Liu
Qingjie Liu
Y. Wang
42
0
0
28 Mar 2025
VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow
Ada Gorgun
Bernt Schiele
Jonas Fischer
29
0
0
28 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
64
2
0
27 Mar 2025
vGamba: Attentive State Space Bottleneck for efficient Long-range Dependencies in Visual Recognition
Yunusa Haruna
A. Lawan
Mamba
47
0
0
27 Mar 2025
Evaluating Facial Expression Recognition Datasets for Deep Learning: A Benchmark Study with Novel Similarity Metrics
F. X. Gaya-Morey
Cristina Manresa-Yee
Célia Martinie
Jose Maria Buades Rubio
64
0
0
26 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
129
0
0
26 Mar 2025
Hierarchical Label Propagation: A Model-Size-Dependent Performance Booster for AudioSet Tagging
Ludovic Tuncay
Etienne Labbé
Thomas Pellegrini
VLM
30
0
0
26 Mar 2025
Bandwidth Allocation for Cloud-Augmented Autonomous Driving
Peter Schafhalter
Alexander Krentsel
Joseph E. Gonzalez
Sylvia Ratnasamy
S. Shenker
Ion Stoica
74
0
0
26 Mar 2025
SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective
Ziyu Zhou
Keyan Hu
Yutian Fang
Xiaoping Rui
75
0
0
26 Mar 2025
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
42
0
0
25 Mar 2025
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Zihang Lai
Andrea Vedaldi
34
0
0
25 Mar 2025
Optimizing Breast Cancer Detection in Mammograms: A Comprehensive Study of Transfer Learning, Resolution Reduction, and Multi-View Classification
D. Petrini
Hae Yong Kim
41
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Y. Lu
Sifei Liu
...
Jan Kautz
Song Han
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
56
0
0
25 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. García-Peraza-Herrera
MedIm
38
0
0
25 Mar 2025
Adaptive Wavelet Filters as Practical Texture Feature Amplifiers for Parkinson's Disease Screening in OCT
X. Zhang
Hanfeng Shi
X. Li
Haili Ye
Tao Xu
Na Li
Yan Hu
Fan Lv
J. Chen
Jiang Liu
42
0
0
25 Mar 2025
Frequency Dynamic Convolution for Dense Image Prediction
Linwei Chen
Lin Gu
Liang Li
C. Yan
Ying Fu
37
0
0
24 Mar 2025
Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module
Yishen Liu
Shengda Liu
Hudan Pan
MedIm
45
0
0
24 Mar 2025
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
59
1
0
24 Mar 2025
CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI
Siyuan Cheng
Lingjuan Lyu
Zhenting Wang
X. Zhang
Vikash Sehwag
40
0
0
24 Mar 2025
Efficient Deep Learning Approaches for Processing Ultra-Widefield Retinal Imaging
Siwon Kim
Wooyung Yun
Jeongbin Oh
Soomok Lee
MedIm
44
0
0
23 Mar 2025
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee
Debasmit Das
Munawar Hayat
Sungha Choi
Kyuwoong Hwang
Fatih Porikli
VLM
63
0
0
23 Mar 2025
Co-op: Correspondence-based Novel Object Pose Estimation
Sungphill Moon
Hyeontae Son
Dongcheol Hur
Sangwook Kim
3DH
59
1
0
22 Mar 2025
Restoring Forgotten Knowledge in Non-Exemplar Class Incremental Learning through Test-Time Semantic Evolution
Haori Lu
Xusheng Cao
Linlan Huang
Enguang Wang
Fei Yang
Xialei Liu
CLL
47
0
0
21 Mar 2025