Segment Anything

arXiv:2304.02643
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
    MLLM
    VLM

Papers citing "Segment Anything"

50 / 4,200 papers shown
VIVID-10M: A Dataset and Baseline for Versatile and Interactive Video Local Editing
Jiahao Hu
Tianxiong Zhong
Xuebo Wang
Boyuan Jiang
Xingye Tian
Fei Yang
Pengfei Wan
Di Zhang
VGen
74
2
0
22 Nov 2024
Optimized Vessel Segmentation: A Structure-Agnostic Approach with Small Vessel Enhancement and Morphological Correction
Dongning Song
Weijian Huang
Jiarun Liu
Md Jahidul Islam
Hao Yang
Shanshan Wang
82
0
0
22 Nov 2024
Aim My Robot: Precision Local Navigation to Any Object
Xiangyun Meng
Xuning Yang
Sanghun Jung
F. Ramos
Srid Sadhan Jujjavarapu
Sanjoy Paul
Dieter Fox
94
1
0
22 Nov 2024
Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge
Yiyang Cai
Zhengkai Jiang
Yang Liu
Chunyang Jiang
Wei Xue
Wenhan Luo
Yike Guo
101
0
0
22 Nov 2024
Panther: Illuminate the Sight of Multimodal LLMs with Instruction-Guided Visual Prompts
Honglin Li
Yuting Gao
Chenglu Zhu
Jingdong Chen
M. Yang
Lin Yang
MLLM
100
0
0
21 Nov 2024
Sli2Vol+: Segmenting 3D Medical Images Based on an Object Estimation Guided Correspondence Flow Network
Delin An
Pengfei Gu
Milan Sonka
Chaoli Wang
Danny Chen
87
1
0
21 Nov 2024
Segment Any Class (SAC): Multi-Class Few-Shot Semantic Segmentation via Class Region Proposals
Hussni Mohd Zakir
Eric Tatt Wei Ho
VLM
84
0
0
21 Nov 2024
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
Ziyi Wang
Yufei Wang
Xumin Yu
Jie Zhou
Jiwen Lu
74
0
0
20 Nov 2024
Cyborg Insect Factory: Automatic Assembly System to Build up Insect-computer Hybrid Robot Based on Vision-guided Robotic Arm Manipulation of Custom Bipolar Electrodes
Qifeng Lin
Nghia Vuong
Kewei Song
Phuoc Thanh Tran Ngoc
Greg Angelo Gonzales Nonato
Hirotaka Sato
67
0
0
20 Nov 2024
Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
Xuechao Zou
Shun Zhang
Kai Li
Shiying Wang
Junliang Xing
Lei Jin
Congyan Lang
Pin Tao
68
1
0
20 Nov 2024
Chanel-Orderer: A Channel-Ordering Predictor for Tri-Channel Natural Images
Shen Li
Lei Jiang
Wei Wang
Hongwei Hu
Liang Li
78
0
0
20 Nov 2024
Find Any Part in 3D
Ziqi Ma
Yisong Yue
Georgia Gkioxari
3DPC
115
3
0
20 Nov 2024
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
Junlong Cheng
Bin Fu
Jin Ye
Guoan Wang
Tianbin Li
...
Jianfei Chen
Jiajian Li
Yanzhou Su
Min Zhu
Junjun He
VLM
89
4
0
19 Nov 2024
CV-Cities: Advancing Cross-View Geo-Localization in Global Cities
Gaoshuang Huang
Yang Zhou
Luying Zhao
Wenjian Gan
75
2
0
19 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
118
1
0
19 Nov 2024
TrojanRobot: Physical-World Backdoor Attacks Against VLM-based Robotic Manipulation
Xiaobei Wang
Hewen Pan
Hangtao Zhang
Minghui Li
Shengshan Hu
...
Peijin Guo
Yichen Wang
Wei Wan
Aishan Liu
L. Zhang
AAML
93
7
0
18 Nov 2024
Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining
Danny Barash
Emilie Manning
Aidan Van Vleck
Omri Hirsch
Kyi Lei Aye
...
Sumaira Aasi
Kerri E. Rieger
Kavita Y. Sarin
Oren Freifeld
Yonatan Winetraub
85
0
0
18 Nov 2024
VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation
Bangguo Yu
Yuzhen Liu
Lei Han
Hamidreza Kasaei
Tingguang Li
M. Cao
LM&Ro
83
3
0
18 Nov 2024
SignEye: Traffic Sign Interpretation from Vehicle First-Person View
Chuang Yang
Xu Han
T. Han
Yuejiao Su
Junyu Gao
Hongyuan Zhang
Yi Wang
Lap-Pui Chau
89
2
0
18 Nov 2024
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
Yunong Liu
Cristobal Eyzaguirre
Manling Li
Shubh Khanna
Juan Carlos Niebles
Vineeth Ravi
Saumitra Mishra
Weiyu Liu
Jiajun Wu
88
1
0
18 Nov 2024
Text-guided Zero-Shot Object Localization
Jingjing Wang
Xinglin Piao
Zongzhi Gao
Bo Li
Yong Zhang
Baocai Yin
79
0
0
18 Nov 2024
Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition
Hanyu Guo
Wanchuan Yu
Suzhou Que
Kaiwen Du
Yan Yan
Hanzi Wang
75
1
0
18 Nov 2024
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory
Cheng-Yen Yang
Hsiang-Wei Huang
Wenhao Chai
Zhongyu Jiang
Lei Li
VLM
102
18
0
18 Nov 2024
The Sound of Water: Inferring Physical Properties from Pouring Liquids
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
Andrew Zisserman
58
0
0
18 Nov 2024
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
M. Arda Aydın
Efe Mert Çırpar
Elvin Abdinli
Gözde B. Ünal
Y. Sahin
VLM
81
0
0
18 Nov 2024
Zero-Shot Automatic Annotation and Instance Segmentation using LLM-Generated Datasets: Eliminating Field Imaging and Manual Annotation for Deep Learning Model Development
Ranjan Sapkota
Achyut Paudel
Manoj Karkee
SyDa
71
7
0
18 Nov 2024
Person Segmentation and Action Classification for Multi-Channel Hemisphere Field of View LiDAR Sensors
Svetlana Seliunina
Artem Otelepko
Raphael Memmesheimer
Sven Behnke
46
0
0
17 Nov 2024
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing
Chang-Shu Liu
Rui Li
Kaidong Zhang
Yunwei Lan
Dong Liu
DiffM
VGen
63
3
0
17 Nov 2024
DGS-SLAM: Gaussian Splatting SLAM in Dynamic Environment
Mangyu Kong
Jaewon Lee
Seongwon Lee
Euntai Kim
3DGS
37
1
0
16 Nov 2024
AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations
J. Mao
Yue Yang
Xuesong Yin
Ling Shao
Hao Tang
40
0
0
16 Nov 2024
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
Yue Zhou
Mengcheng Lan
Xiang Li
Yiping Ke
Xue Jiang
Qingyun Li
Xue Yang
Wayne Zhang
ObjD
VLM
119
5
0
16 Nov 2024
SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction
Yutao Tang
Y. Guo
Deming Li
Cheng-Fang Peng
3DGS
82
0
0
15 Nov 2024
Learning Generalizable 3D Manipulation With 10 Demonstrations
Yu Ren
Yang Cong
Ronghan Chen
Jiahao Long
SSL
66
1
0
15 Nov 2024
SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
Zhaoyu Chen
Juan Wang
Wen Wang
Sunhan Xu
Hang Xiong
...
Jian Guo
Shuxun Wang
Chun Yuan
Bing Li
Weiming Hu
VLM
55
2
0
15 Nov 2024
CoSAM: Self-Correcting SAM for Domain Generalization in 2D Medical Image Segmentation
Yihang Fu
Zhaoyu Chen
Yiwen Ye
Xingliang Lei
Zhisong Wang
Yong-quan Xia
VLM
MedIm
37
4
0
15 Nov 2024
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation
Dengke Zhang
Fagui Liu
Quan Tang
VLM
62
1
0
15 Nov 2024
ColorEdit: Training-free Image-Guided Color editing with diffusion model
Xingxi Yin
Zhi Li
Jingfeng Zhang
Chenglin Li
Yin Zhang
DiffM
59
0
0
15 Nov 2024
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Mian
Joey Tianyi Zhou
Chen Chen
LRM
68
1
0
15 Nov 2024
MFP3D: Monocular Food Portion Estimation Leveraging 3D Point Clouds
Jinge Ma
Xiaoyan Zhang
Gautham Vinod
S. Raghavan
Jiangpeng He
F. Zhu
57
1
0
14 Nov 2024
Assessing the Performance of the DINOv2 Self-supervised Learning Vision Transformer Model for the Segmentation of the Left Atrium from MRI Images
Bipasha Kundu
Bidur Khanal
R. Simon
Cristian A. Linte
MedIm
28
2
0
14 Nov 2024
LLV-FSR: Exploiting Large Language-Vision Prior for Face Super-resolution
Chenyang Wang
Wenjie An
Kui Jiang
Xianming Liu
Junjun Jiang
CVBM
38
0
0
14 Nov 2024
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
Yuheng Shi
Minjing Dong
Chang Xu
VLM
48
1
0
14 Nov 2024
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
Youpeng Wen
Junfan Lin
Bo Li
Jiawei Han
Hang Xu
Shen Zhao
Xiaodan Liang
VGen
DiffM
45
2
0
14 Nov 2024
Spider: Any-to-Many Multimodal LLM
Jinxiang Lai
Jie Zhang
Jun Liu
Jian Li
Xiaocheng Lu
Song Guo
MLLM
72
2
0
14 Nov 2024
Physics Informed Distillation for Diffusion Models
Joshua Tian Jin Tee
Kang Zhang
Hee Suk Yoon
Dhananjaya N. Gowda
Chanwoo Kim
Chang D. Yoo
DiffM
70
4
0
13 Nov 2024
ReMP: Reusable Motion Prior for Multi-domain 3D Human Pose Estimation and Motion Inbetweening
Hojun Jang
Y. Kim
3DH
38
0
0
13 Nov 2024
Grounded Video Caption Generation
Evangelos Kazakos
Cordelia Schmid
Josef Sivic
46
0
0
12 Nov 2024
Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors
Anisha Pal
Julia Kruk
Mansi Phute
Manognya Bhattaram
Diyi Yang
Duen Horng Chau
Judy Hoffman
AAML
50
2
0
12 Nov 2024
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
45
0
0
12 Nov 2024
Commissioning An All-Sky Infrared Camera Array for Detection Of Airborne Objects
Laura Dominé
Ankit Biswas
Richard Cloete
Alex Delacroix
Andriy Fedorenko
...
Mike Prior
Forrest Schultz
Matthew Szenher
W. Watters
Abby White
26
1
0
12 Nov 2024