ResearchTrend.AI
Segment Anything

5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
    MLLM
    VLM

Papers citing "Segment Anything"

50 / 4,200 papers shown
On Moving Object Segmentation from Monocular Video with Transformers
Christian Homeyer
Christoph Schnörr
107
3
0
28 Nov 2024
COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection
Xiaoqin Zhang
Zhenni Yu
Li Zhao
Deng-Ping Fan
Guobao Xiao
VLM
76
0
0
28 Nov 2024
InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception
Haijie Li
Y. Wu
Jiarui Meng
Qiankun Gao
Zhiyao Zhang
Ronggang Wang
Jian Zhang
ISeg
91
2
0
28 Nov 2024
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents
Joongwon Chae
Zhenyu Wang
Peiwu Qin
Dongmei Yu
66
0
0
27 Nov 2024
Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models
Jiaheng Liu
Yumeng Li
Boyuan Xiao
Yichang Jian
Ziang Qin
Tianjia Shao
Yao-Xiang Ding
Kun Zhou
MLLM
LRM
105
4
0
27 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
109
7
0
27 Nov 2024
Rapid Distributed Fine-tuning of a Segmentation Model Onboard Satellites
Meghan Plumridge
Rasmus Maråk
Chiara Ceccobello
Pablo Gómez
Gabriele Meoni
F. Svoboda
Nicholas D. Lane
68
0
0
26 Nov 2024
HyperSeg: Towards Universal Visual Segmentation with Large Language Model
Cong Wei
Yujie Zhong
Haoxian Tan
Yong Liu
Zheng Zhao
Jie Hu
Yujiu Yang
VOS
MLLM
VLM
LRM
91
1
0
26 Nov 2024
Distractor-free Generalizable 3D Gaussian Splatting
Yanqi Bao
Jing Liao
Jing Huo
Yang Gao
3DGS
97
1
0
26 Nov 2024
A Distractor-Aware Memory for Visual Object Tracking with SAM2
Jovana Videnovic
A. Lukežič
Matej Kristan
VLM
91
2
0
26 Nov 2024
SAM-MPA: Applying SAM to Few-shot Medical Image Segmentation using Mask Propagation and Auto-prompting
Jie Xu
Xiaokang Li
Chengyu Yue
Yuanyuan Wang
Yi Guo
MedIm
86
1
0
26 Nov 2024
HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator
Fan Yang
Ru Zhen
Jinqiao Wang
Yanhao Zhang
Haoxiang Chen
Haonan Lu
Sicheng Zhao
Guiguang Ding
83
0
0
26 Nov 2024
Exploring Aleatoric Uncertainty in Object Detection via Vision Foundation Models
Peng Cui
Guande He
Dan Zhang
Zhijie Deng
Yinpeng Dong
Jun Zhu
92
1
0
26 Nov 2024
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation
Xiang Li
Zixuan Huang
Anh Thai
James M. Rehg
3DGS
87
0
0
26 Nov 2024
ΩSFormer: Dual-Modal Ω-like Super-Resolution Transformer Network for Cross-scale and High-accuracy Terraced Field Vectorization Extraction
Chang Li
Yu Wang
Chuxu Zhang
Yongjun Zhang
61
0
0
26 Nov 2024
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Yansen Wang
Ming-Hsuan Yang
VLM
87
2
0
26 Nov 2024
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu
Jiajian Li
Yichen Jiang
Niranjan Sujay
Zheng Yang
Juexiao Zhang
John Abanes
Jing Zhang
Chen Feng
116
2
0
26 Nov 2024
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
Sudarshan Rajagopalan
Nithin Gopalakrishnan Nair
Jay N. Paranjape
Vishal M. Patel
DiffM
98
0
0
26 Nov 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
112
2
0
26 Nov 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
89
0
0
26 Nov 2024
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation
Bastian Wittmann
Yannick Wattenberg
Tamaz Amiranashvili
Suprosanna Shit
Bjoern H. Menze
92
3
0
26 Nov 2024
Probing the Mid-level Vision Capabilities of Self-Supervised Learning
Xuweiyi Chen
Markus Marks
Zezhou Cheng
89
0
0
25 Nov 2024
Open Vocabulary Monocular 3D Object Detection
Jin Yao
Hao Gu
Xuweiyi Chen
Jiayun Wang
Zezhou Cheng
ObjD
VLM
78
3
0
25 Nov 2024
Diffusion Features for Zero-Shot 6DoF Object Pose Estimation
Bernd Von Gimborn
P. Ausserlechner
Markus Vincze
S. Thalhammer
DiffM
76
0
0
25 Nov 2024
Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models
Ronghuan Wu
Wanchao Su
Jing Liao
DiffM
79
1
0
25 Nov 2024
J-CaPA : Joint Channel and Pyramid Attention Improves Medical Image Segmentation
Marzia Binta Nizam
Marian Zlateva
James Davis
MedIm
69
0
0
25 Nov 2024
One Diffusion to Generate Them All
Duong H. Le
Tuan Pham
Sangho Lee
Christopher Clark
Aniruddha Kembhavi
Stephan Mandt
Ranjay Krishna
Jiasen Lu
VLM
84
5
0
25 Nov 2024
Phase-Informed Tool Segmentation for Manual Small-Incision Cataract Surgery
Bhuvan Sachdeva
Naren Akash
Tajamul Ashraf
Simon Muller
T. Schultz
M. Wintergerst
Niharika Singri Prasad
K. Murali
Mohit Jain
86
0
0
25 Nov 2024
Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking
P. Nguyen
Minh Luu
Anh Tran
Cuong Pham
K. Nguyen
3DPC
87
0
0
25 Nov 2024
Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain
Hangyul Yoon
Doohyuk Jang
JungEun Kim
Eunho Yang
VLM
MedIm
77
1
0
25 Nov 2024
Boosting 3D Object Generation through PBR Materials
Yitong Wang
Xudong Xu
Li Ma
Haozhao Wang
Bo Dai
83
3
0
25 Nov 2024
Language Driven Occupancy Prediction
Zhu Yu
Bowen Pang
Lizhe Liu
Runmin Zhang
Qihao Peng
Maochun Luo
Sheng Yang
Mingxia Chen
Si-Yuan Cao
Hui-Liang Shen
97
2
0
25 Nov 2024
ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
Haozhan Shen
Kangjia Zhao
Tiancheng Zhao
Ruochen Xu
Zilun Zhang
Mingwei Zhu
Jianwei Yin
97
4
0
25 Nov 2024
Phys4DGen: Physics-Compliant 4D Generation with Multi-Material Composition Perception
Jiajing Lin
Zhenzhong Wang
Shu Jiang
Yongjie Hou
Min Jiang
VGen
79
0
0
25 Nov 2024
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng
Yijiang Li
Wanpeng Zhang
Sipeng Zheng
Zongqing Lu
109
1
0
25 Nov 2024
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu
Gu Wang
Ruida Zhang
Chenyangguang Zhang
F. Tombari
Xiangyang Ji
290
2
0
25 Nov 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
83
0
0
24 Nov 2024
Bundle Adjusted Gaussian Avatars Deblurring
Muyao Niu
Yifan Zhan
Qingtian Zhu
Zechao Li
Wei Wang
Zhihang Zhong
Xingchen Sun
Yinqiang Zheng
3DGS
88
0
0
24 Nov 2024
OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions
Guanyu Zhou
Xiaohan Yu
Wenxin Huang
Xuemei Jia
Xian Zhong
Chia-Wen Lin
CML
81
0
0
24 Nov 2024
ROOT: VLM based System for Indoor Scene Understanding and Beyond
Yonghui Wang
Shi-Yong Chen
Zhenxing Zhou
Siyi Li
Haoran Li
Wengang Zhou
Haoyang Li
VLM
72
3
0
24 Nov 2024
AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks
Yuchen Li
Fan Ma
Yi Yang
DiffM
154
2
0
24 Nov 2024
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Qifan Yu
Wei Chow
Zhongqi Yue
Kaihang Pan
Yang Wu
Xiaoyang Wan
Juncheng Billy Li
Siliang Tang
Hao Zhang
Yueting Zhuang
DiffM
112
17
0
24 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
87
3
0
24 Nov 2024
Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data
Rui Huang
Henry Zheng
Yan Wang
Zhuofan Xia
Marco Pavone
Gao Huang
3DPC
VLM
96
1
0
23 Nov 2024
Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation
Jinwoo Ahn
Hyeokjoon Kwon
Hwiyeon Yoo
ObjD
VLM
82
0
0
23 Nov 2024
CellPilot
Philipp Endres
Valentin Koch
Julia A. Schnabel
Carsten Marr
VLM
MedIm
76
0
0
23 Nov 2024
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Chaehun Shin
Jooyoung Choi
Heeseung Kim
Sungroh Yoon
DiffM
94
8
0
23 Nov 2024
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Hang Hua
Qing Liu
Lingzhi Zhang
Jing Shi
Zhifei Zhang
Yilin Wang
Jianming Zhang
Jiebo Luo
CoGe
VLM
103
6
0
23 Nov 2024
There is no SAMantics! Exploring SAM as a Backbone for Visual Understanding Tasks
Miguel Espinosa
Chenhongyi Yang
Linus Ericsson
Jingyu Sun
Elliot J. Crowley
VLM
80
0
0
22 Nov 2024
Benchmarking the Robustness of Optical Flow Estimation to Corruptions
Zhonghua Yi
Hao-miao Shi
Zhijie Xu
Yao Gao
Ze Wang
Yanmei Zhang
Kailun Yang
Kaiwei Wang
AAML
87
1
0
22 Nov 2024