2203.11926
Cited By
Focal Modulation Networks
22 March 2022
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
Papers citing
"Focal Modulation Networks"
50 / 141 papers shown
Title
Efficient Video Object Segmentation via Modulated Cross-Attention Memory
Abdelrahman M. Shaker
Syed Talal Wasim
Martin Danelljan
Salman Khan
Ming-Hsuan Yang
Fahad Shahbaz Khan
VOS
18
3
0
26 Mar 2024
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Guan-Feng Wang
Long Bai
Wan Jun Nah
Jie Wang
Zhaoxi Zhang
Zhen Chen
Jinlin Wu
Mobarakol Islam
Hongbin Liu
Hongliang Ren
40
14
0
22 Mar 2024
Multi-Scale Implicit Transformer with Re-parameterize for Arbitrary-Scale Super-Resolution
Jinchen Zhu
Mingjian Zhang
Ling Zheng
Shizhuang Weng
27
0
0
11 Mar 2024
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery
Mubashir Noman
Muzammal Naseer
Hisham Cholakkal
Rao Muhammad Anwer
Salman Khan
Fahad Shahbaz Khan
ViT
28
34
0
08 Mar 2024
Frequency-Adaptive Dilated Convolution for Semantic Segmentation
Linwei Chen
Lin Gu
Ying Fu
16
21
0
08 Mar 2024
A new method for optical steel rope non-destructive damage detection
Yunqing Bao
Bin Hu
14
0
0
06 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Alireza Makhzani
ODL
44
12
0
05 Feb 2024
Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation
Arash Harirpoush
Amir Rasoulian
Marta Kersten-Oertel
Yiming Xiao
3DV
6
0
0
05 Feb 2024
Focal Modulation Networks for Interpretable Sound Classification
Luca Della Libera
Cem Subakan
Mirco Ravanelli
28
2
0
05 Feb 2024
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
Seokju Yun
Youngmin Ro
ViT
26
29
0
29 Jan 2024
Learning to Prompt Segment Anything Models
Jiaxing Huang
Kai Jiang
Jingyi Zhang
Han Qiu
Lewei Lu
Shijian Lu
Eric P. Xing
VLM
LRM
28
7
0
09 Jan 2024
Perception Test 2023: A Summary of the First Challenge And Outcome
Joseph Heyward
João Carreira
Dima Damen
Andrew Zisserman
Viorica Patraucean
14
0
0
20 Dec 2023
Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games
Lukas Schäfer
Logan Jones
Anssi Kanervisto
Yuhan Cao
Tabish Rashid
Raluca Georgescu
David Bignell
Siddhartha Sen
Andrea Trevino Gavito
Sam Devlin
82
3
0
04 Dec 2023
Infrared Image Super-Resolution via GAN
Y. Huang
S. Omachi
GAN
19
0
0
01 Dec 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
11
72
0
28 Nov 2023
Stable Segment Anything Model
Qi Fan
Xin Tao
Lei Ke
Mingqiao Ye
Yuanhui Zhang
Pengfei Wan
Zhong-ming Wang
Yu-Wing Tai
Chi-Keung Tang
VLM
15
6
0
27 Nov 2023
Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models
Francesco Croce
Matthias Hein
VLM
17
3
0
24 Nov 2023
Hardware Resilience Properties of Text-Guided Image Classifiers
Syed Talal Wasim
Kabila Haile Soboka
Abdulrahman Mahmoud
Salman Khan
David Brooks
Gu-Yeon Wei
VLM
12
1
0
23 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
31
140
0
10 Nov 2023
Are Natural Domain Foundation Models Useful for Medical Image Classification?
Joana Palés Huix
Adithya Raju Ganeshan
Johan Fredin Haslum
Magnus P Soderberg
Christos Matsoukas
Kevin Smith
OOD
MedIm
VLM
11
27
0
30 Oct 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
31
32
0
30 Oct 2023
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection
Lingchen Meng
Xiyang Dai
Jianwei Yang
Dongdong Chen
Yinpeng Chen
Mengchen Liu
Yi-Ling Chen
Zuxuan Wu
Lu Yuan
Yu-Gang Jiang
8
6
0
18 Oct 2023
HydraViT: Adaptive Multi-Branch Transformer for Multi-Label Disease Classification from Chest X-ray Images
Şaban Öztürk
M. Y. Turali
Tolga Çukur
MedIm
ViT
17
10
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
32
3
0
08 Oct 2023
Cell Tracking-by-detection using Elliptical Bounding Boxes
Lucas N. Kirsten
Cláudio R. Jung
11
1
0
07 Oct 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
30
65
0
20 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
19
22
0
04 Sep 2023
ACC-UNet: A Completely Convolutional UNet model for the 2020s
Nabil Ibtehaz
Daisuke Kihara
MedIm
OOD
ViT
11
40
0
25 Aug 2023
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition
Changxu Cheng
P. Wang
Cheng Da
Qi Zheng
Cong Yao
23
15
0
24 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
11
8
0
22 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
35
3
0
18 Aug 2023
High-Fidelity Lake Extraction via Two-Stage Prompt Enhancement: Establishing a Novel Baseline and Benchmark
Ben Chen
Xuechao Zou
K. Li
Yu-an Zhang
Junliang Xing
Pin Tao
15
1
0
16 Aug 2023
Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction
Yangyang Xu
Yibo Yang
Bernard Ghanem
L. Zhang
Du Bo
Dacheng Tao
8
1
0
10 Aug 2023
A Unified Interactive Model Evaluation for Classification, Object Detection, and Instance Segmentation in Computer Vision
Changjian Chen
Yukai Guo
Fengyuan Tian
Siyi Liu
Weikai Yang
Zhao-Ming Wang
Jing Wu
Hang Su
Hanspeter Pfister
Shixia Liu
20
13
0
09 Aug 2023
Weakly supervised segmentation of intracranial aneurysms using a novel 3D focal modulation UNet
Amir Rasoulian
Arash Harirpoush
Soorena Salari
Yiming Xiao
16
1
0
06 Aug 2023
FocalErrorNet: Uncertainty-aware focal modulation network for inter-modal registration error estimation in ultrasound-guided neurosurgery
Soorena Salari
Amir Rasoulian
H. Rivaz
Yiming Xiao
12
6
0
26 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
13
116
0
25 Jul 2023
Multi-Modal Machine Learning for Assessing Gaming Skills in Online Streaming: A Case Study with CS:GO
Longxiang Zhang
Wenping Wang
29
1
0
23 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
12
65
0
17 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
F. Khan
ViT
46
12
0
13 Jul 2023
Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions
Dongshuo Yin
Xueting Han
Bin Li
Hao Feng
Jinghua Bai
VPVLM
26
16
0
16 Jun 2023
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa
Shehreen Azad
V. Sachidanand
Yunhao Ge
O. Mikšík
Y. S. Rawat
Vibhav Vineet
OOD
VLM
AAML
9
5
0
15 Jun 2023
Quantitative Analysis of Primary Attribution Explainable Artificial Intelligence Methods for Remote Sensing Image Classification
Akshatha Mohan
Joshua Peeples
9
4
0
06 Jun 2023
Recognize Anything: A Strong Image Tagging Model
Youcai Zhang
Xinyu Huang
Jinyu Ma
Zhaoyang Li
Zhaochuan Luo
...
Tong Luo
Yaqian Li
Siyi Liu
Yandong Guo
Lei Zhang
VLM
8
124
0
06 Jun 2023
Semantic Segmentation on VSPW Dataset through Contrastive Loss and Multi-dataset Training Approach
Min Yan
Qianxiong Ning
Qian Wang
12
1
0
06 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
25
27
0
01 Jun 2023
Image Quality Is Not All You Want: Task-Driven Lens Design for Image Classification
Xinge Yang
Qiang Fu
Yunfeng Nie
Wolfgang Heidrich
VLM
13
7
0
26 May 2023
ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Christoph Auer
A. Nassar
Maksym Lysak
Michele Dolfi
Nikolaos Livathinos
Peter W. J. Staar
OOD
3DV
14
6
0
24 May 2023
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
92
76
0
22 May 2023
Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer
Hakjin Lee
Minki Song
Jamyoung Koo
Junghoon Seo
30
7
0
12 May 2023