InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

10 November 2022
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
Xizhou Zhu
Xiao-hua Hu
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
    VLM

Papers citing "InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions"

50 / 70 papers shown
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
103
0
0
17 Apr 2025
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
53
0
0
20 Mar 2025
GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation
Zinqin Huang
Gu Wang
Chenyangguang Zhang
Ruida Zhang
Xiu Li
Xiangyang Ji
46
0
0
19 Mar 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
110
1
0
27 Feb 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
87
1
0
20 Feb 2025
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
86
0
0
08 Feb 2025
Conformable Convolution for Topologically Aware Learning of Complex Anatomical Structures
Yousef Yeganeh
Rui Xiao
Goktug Guvercin
Nassir Navab
Azade Farshad
MedIm
40
0
0
31 Dec 2024
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
112
1
0
16 Dec 2024
ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction
Yi Feng
Yu Han
Xijing Zhang
Tanghui Li
Yanting Zhang
Rui Fan
107
3
0
15 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
33
0
0
12 Nov 2024
HRGR: Enhancing Image Manipulation Detection via Hierarchical Region-aware Graph Reasoning
Xudong Wang
Y. Li
Huiyu Zhou
Jiaran Zhou
Junyu Dong
37
1
0
29 Oct 2024
Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation
Youngwan Jin
Incheol Park
Hanbin Song
Hyeongjin Ju
Yagiz Nalcakan
Shiho Kim
ViT
20
2
0
25 Sep 2024
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation
Hugo Porta
Emanuele Dalsasso
Diego Marcos
D. Tuia
93
0
0
14 Sep 2024
ICPR 2024 Competition on Safe Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather Conditions
Furqan Ahmed Shaik
Sandeep Nagar
Aiswarya Maturi
Harshit Kumar Sankhla
Dibyendu Ghosh
Anshuman Majumdar
Srikanth Vidapanakal
Kunal Chaudhary
Sunny Manchanda
Girish Varma
22
0
0
09 Sep 2024
MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation
Beoungwoo Kang
Seunghun Moon
Yubin Cho
Hyunwoo Yu
Suk-Ju Kang
ViT
MedIm
24
8
0
14 Aug 2024
DeepGate3: Towards Scalable Circuit Representation Learning
Zhengyuan Shi
Ziyang Zheng
Sadaf Khan
Jianyuan Zhong
Min Li
Qiang Xu
GNN
AI4CE
29
8
0
15 Jul 2024
SACNet: A Spatially Adaptive Convolution Network for 2D Multi-organ Medical Segmentation
Lin Zhang
Wenbo Gao
Jie Yi
Yunyun Yang
38
0
0
14 Jul 2024
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
Di Wang
Meiqi Hu
Yao Jin
Yuchun Miao
Jiaqi Yang
...
Lefei Zhang
Chen Wu
Bo Du
Dacheng Tao
Liangpei Zhang
59
21
0
17 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge
Nan Zhang
Xidan Zhang
Jianing Wei
Fangjun Wang
Zhiming Tan
MDE
22
0
0
06 Jun 2024
YotoR-You Only Transform One Representation
José Ignacio Díaz Villa
P. Loncomilla
Javier Ruiz-del-Solar
ViT
22
0
0
30 May 2024
Accelerating Transformers with Spectrum-Preserving Token Merging
Hoai-Chau Tran
D. M. Nguyen
Duy M. Nguyen
Trung Thanh Nguyen
Ngan Le
Pengtao Xie
Daniel Sonntag
James Y. Zou
Binh T. Nguyen
Mathias Niepert
32
8
0
25 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
36
5
0
22 May 2024
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Runwei Guan
Ruixiao Zhang
Ningwei Ouyang
Jianan Liu
Ka Lok Man
...
Ming Xu
Jeremy S. Smith
Eng Gee Lim
Yutao Yue
Hui Xiong
46
8
0
21 May 2024
LyS at SemEval-2024 Task 3: An Early Prototype for End-to-End Multimodal Emotion Linking as Graph-Based Parsing
Ana Ezquerro
David Vilares
34
1
0
10 May 2024
Automatic Defect Detection in Sewer Network Using Deep Learning Based Object Detector
Bach Ha
Birgit Schalter
Laura White
J. Köhler
ObjD
AI4CE
19
2
0
09 Apr 2024
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection
Zihua Liu
Hiroki Sakuma
Masatoshi Okutomi
30
2
0
29 Mar 2024
PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks
Philip Matthias Winter
M. Wimmer
David Major
Dimitrios Lenis
Astrid Berg
Theresa Neubauer
Gaia Romana De Paolis
Johannes Novotny
Sophia Ulonska
Katja Bühler
34
0
0
18 Mar 2024
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
38
44
0
04 Mar 2024
Enhancing Small Object Encoding in Deep Neural Networks: Introducing Fast&Focused-Net with Volume-wise Dot Product Layer
Tofik Ali
Partha Pratim Roy
ObjD
19
2
0
18 Jan 2024
Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics
Beiwen Tian
Huan-ang Gao
Leiyao Cui
Yupeng Zheng
Lan Luo
Baofeng Wang
Rong Zhi
Guyue Zhou
Hao Zhao
19
4
0
10 Jan 2024
UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification
Tianjie Dai
Ruipeng Zhang
Feng Hong
Jiangchao Yao
Ya-Qin Zhang
Yanfeng Wang
17
8
0
18 Dec 2023
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn
S. Avidan
21
10
0
29 Nov 2023
IDD-AW: A Benchmark for Safe and Robust Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather
Furqan Ahmed Shaik
Abhishek Malreddy
Nikhil Reddy Billa
Kunal Chaudhary
Sunny Manchanda
Girish Varma
7
11
0
24 Nov 2023
LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories
Silvan Weder
Hermann Blum
Francis Engelmann
Marc Pollefeys
VLM
19
11
0
20 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
21
64
0
07 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
32
7
0
19 Oct 2023
Unifying Image Processing as Visual Prompting Question Answering
Yihao Liu
Xiangyu Chen
Xianzheng Ma
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
MLLM
22
18
0
16 Oct 2023
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
Ming Jin
Qingsong Wen
Yuxuan Liang
Chaoli Zhang
Siqiao Xue
...
Shirui Pan
Vincent S. Tseng
Yu Zheng
Lei Chen
Hui Xiong
AI4TS
SyDa
31
116
0
16 Oct 2023
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Haoyi Zhu
Honghui Yang
Xiaoyang Wu
Di Huang
Sha Zhang
...
Hengshuang Zhao
Chunhua Shen
Yu Qiao
Tong He
Wanli Ouyang
SSL
69
42
0
12 Oct 2023
The Robust Semantic Segmentation UNCV2023 Challenge Results
Xuanlong Yu
Yi Zuo
Zitao Wang
Xiaowen Zhang
Jiaxuan Zhao
...
Angela Yao
Wenlong Chen
Ivor J. A. Simpson
Neill D. F. Campbell
Gianni Franchi
UQCV
28
4
0
27 Sep 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng-Wei Zhang
Chen Li
Nanning Zheng
Han Hu
25
5
0
03 Aug 2023
PPI-NET: End-to-End Parametric Primitive Inference
Liang Wang
Xiaogang Wang
22
1
0
03 Aug 2023
Tracking Anything in High Quality
Jiawen Zhu
Zhe Chen
Zeqi Hao
Shijie Chang
Lu Zhang
...
Bin Luo
Ju He
Jinpeng Lan
Hanyuan Chen
Chenyang Li
VOS
13
7
0
26 Jul 2023
MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results
Yuki Kondo
Norimichi Ukita
Takayuki Yamaguchi
Haoran Hou
Mu-Yi Shen
...
Ichiro Ide
Yosuke Shinya
Xinyao Liu
Guang Liang
S. Yasui
23
13
0
18 Jul 2023
FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation
Zhiqi Li
Zhiding Yu
David Austin
Mingsheng Fang
Shiyi Lan
Jan Kautz
J. Álvarez
13
98
0
04 Jul 2023
Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions
Dongshuo Yin
Xueting Han
Bin Li
Hao Feng
Jinghua Bai
VPVLM
26
16
0
16 Jun 2023
Semantic Segmentation on VSPW Dataset through Contrastive Loss and Multi-dataset Training Approach
Min Yan
Qianxiong Ning
Qian Wang
17
1
0
06 Jun 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
16
114
0
18 May 2023