Microsoft COCO: Common Objects in Context

1 May 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
    ObjD

Papers citing "Microsoft COCO: Common Objects in Context"

50 / 652 papers shown
RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
Mehdi Moshtaghi
Siavash H. Khajavi
Joni Pajarinen
VLM
77
0
0
25 Mar 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Zhichao Sun
Huazhang Hu
Yidong Ma
Gang Liu
Nemo Chen
Xu Tang
Feng-Long Xie
Yongchao Xu
ObjD
65
0
0
24 Mar 2025
BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors
Yu Wang
Junxian Mu
Hongzhi Huang
Qilong Wang
Pengfei Zhu
Q. Hu
115
1
0
22 Mar 2025
Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval
Yuanmin Tang
Jing Yu
Keke Gai
Jiamin Zhuang
Gang Xiong
Gaopeng Gou
Qi Wu
VGen
100
2
0
21 Mar 2025
Generative Modeling of Class Probability for Multi-Modal Representation Learning
Jungkyoo Shin
Bumsoo Kim
Eunwoo Kim
73
1
0
21 Mar 2025
How to Train Your Dragon: Automatic Diffusion-Based Rigging for Characters with Diverse Topologies
Zeqi Gu
Difan Liu
Timothy Langlois
Matthew Fisher
Abe Davis
DiffM
3DH
75
0
0
19 Mar 2025
Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU
Àlex Pujol Vidal
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
MU
89
0
0
19 Mar 2025
TULIP: Towards Unified Language-Image Pretraining
Zineng Tang
Long Lian
Seun Eisape
Xudong Wang
Roei Herzig
Adam Yala
Alane Suhr
Trevor Darrell
David M. Chan
VLM
CLIP
MLLM
120
5
0
19 Mar 2025
Continual Multimodal Contrastive Learning
Xiaohao Liu
Xiaobo Xia
See-Kiong Ng
Tat-Seng Chua
CLL
132
1
0
19 Mar 2025
8-Calves Image dataset
Xuyang Fang
S. Hannuna
Neill D. F. Campbell
307
0
0
17 Mar 2025
ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models
Hao Yin
Guangzong Si
Zilei Wang
277
0
0
17 Mar 2025
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
98
0
0
17 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
106
0
0
16 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-Jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Yue Yang
130
2
0
16 Mar 2025
SEAL: Semantic Aware Image Watermarking
Kasra Arabi
R. Teal Witter
Chinmay Hegde
Niv Cohen
WIGM
AAML
109
0
0
15 Mar 2025
APLA: A Simple Adaptation Method for Vision Transformers
Moein Sorkhei
Emir Konuk
Kevin Smith
Christos Matsoukas
64
0
0
14 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
362
0
0
14 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
78
1
0
13 Mar 2025
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
R. Hu
Lianghui Zhu
Yuxuan Zhang
Tianheng Cheng
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
ObjD
102
0
0
13 Mar 2025
Hoi2Anomaly: An Explainable Anomaly Detection Approach Guided by Human-Object Interaction
Yuhan Wang
Cheng Liu
Daou Zhang
Weichao Wu
49
0
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
92
0
0
13 Mar 2025
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
Itay Chachy
Guy Yariv
Sagie Benaim
352
0
0
12 Mar 2025
Implicit Contrastive Representation Learning with Guided Stop-gradient
Byeongchan Lee
Sehyun Lee
SSL
148
2
0
12 Mar 2025
NVP-HRI: Zero Shot Natural Voice and Posture-based Human-Robot Interaction via Large Language Model
Yuzhi Lai
Shenghai Yuan
Youssef Nassar
Mingyu Fan
T. Weber
Matthias Rätsch
LM&Ro
75
3
0
12 Mar 2025
DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection
Chiara Cappellino
Gianluca Mancusi
Matteo Mosconi
Angelo Porrello
Simone Calderara
Rita Cucchiara
ObjD
VLM
105
0
0
12 Mar 2025
Mapping fMRI Signal and Image Stimuli in an Artificial Neural Network Latent Space: Bringing Artificial and Natural Minds Together
Cesare Maria Dalbagno
Manuel de Castro Ribeiro Jardim
Mihnea Angheluţă
86
0
0
12 Mar 2025
Aligning Text to Image in Diffusion Models is Easier Than You Think
J. Lee
Byunghee Cha
Jeongsol Kim
Jong Chul Ye
66
0
0
11 Mar 2025
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation
Anzhe Cheng
Chenzhong Yin
Yu Chang
Heng Ping
Shixuan Li
Shahin Nazarian
Paul Bogdan
SSeg
142
0
0
11 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
346
0
0
11 Mar 2025
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
Beitao Chen
Xinyu Lyu
Lianli Gao
Jingkuan Song
Jikang Cheng
107
1
0
11 Mar 2025
DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank
Zhanjie Zhang
Quanwei Zhang
Guangyuan Li
Junsheng Luan
Mengyuan Yang
Yun Wang
Lei Zhao
DiffM
97
4
0
11 Mar 2025
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset
Shuhe Wang
Xiaoya Li
Jiwei Li
G. Wang
Xiaofei Sun
...
Han Qiu
Mo Yu
Shengjie Shen
Tianwei Zhang
Eduard H. Hovy
VLM
70
1
0
10 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
94
0
0
10 Mar 2025
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
Shiu-hong Kao
Yu-Wing Tai
Chi-Keung Tang
LRM
MLLM
117
1
0
10 Mar 2025
SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements
Haiyang Xie
Xi Shen
Shihua Huang
Qirui Wang
Zheng Wang
55
0
0
10 Mar 2025
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Kwanyoung Kim
Byeongsu Sim
DiffM
VLM
99
0
0
10 Mar 2025
AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection
Jialin Lu
Junjie Shan
Ziqi Zhao
Ka-Ho Chow
AAML
96
0
0
09 Mar 2025
Advancing Multimodal In-Context Learning in Large Vision-Language Models with Task-aware Demonstrations
Yanshu Li
82
2
0
05 Mar 2025
Mocap-2-to-3: Lifting 2D Diffusion-Based Pretrained Models for 3D Motion Capture
Zhumei Wang
Zechen Hu
Ruoxi Guo
Huaijin Pi
Ziyong Feng
Sida Peng
Xiaowei Zhou
115
0
0
05 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
127
14
0
05 Mar 2025
Exploring Token-Level Augmentation in Vision Transformer for Semi-Supervised Semantic Segmentation
Dengke Zhang
Quan Tang
Fagui Liu
C. L. Philip Chen
Haiqing Mei
ViT
136
0
0
04 Mar 2025
Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art
Youssef Shoeb
Azarm Nowzad
Hanno Gottschalk
UQCV
139
2
0
04 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
88
0
0
03 Mar 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Tiansheng Wen
Yifei Wang
Zequn Zeng
Zhong Peng
Yudi Su
Xinyang Liu
Bo Chen
Hongwei Liu
Stefanie Jegelka
Chenyu You
CLL
106
3
0
03 Mar 2025
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Junteng Zhao
Yunming Ye
Kola Ye
Yao He
65
0
0
03 Mar 2025
Knowledge Bridger: Towards Training-free Missing Multi-modality Completion
Guanzhou Ke
Shengfeng He
Xinyu Wang
Bo Wang
Guoqing Chao
Yize Zhang
Yi Xie
HeXing Su
103
0
0
27 Feb 2025
WalnutData: A UAV Remote Sensing Dataset of Green Walnuts and Model Evaluation
Mingjie Wu
Chenggui Yang
Huihua Wang
Chen Xue
Yibo Wang
...
Yuqi Han
R. Li
Lijun Yun
Zaiqing Chen
Siyang Song
96
0
0
27 Feb 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
164
1
0
27 Feb 2025
Model Adaptation: Unsupervised Domain Adaptation without Source Data
Rui Li
Qianfen Jiao
Wenming Cao
Hau-San Wong
Si Wu
OOD
158
482
0
26 Feb 2025
Learning Structure-Supporting Dependencies via Keypoint Interactive Transformer for General Mammal Pose Estimation
Tianyang Xu
Jiyong Rao
Xiaoning Song
Zhenhua Feng
Xiao Wu
ViT
133
1
0
25 Feb 2025