ResearchTrend.AI
Microsoft COCO: Common Objects in Context


1 May 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
    ObjD

Papers citing "Microsoft COCO: Common Objects in Context"

50 / 652 papers shown
Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models
Puwei Lian
Yujun Cai
Songze Li
23
0
0
27 May 2025
Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
Mehrdad Fazli
Bowen Wei
Ziwei Zhu
VLM
37
0
0
27 May 2025
ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation
Sanghyun Jo
Wooyeol Lee
Ziseok Lee
Kyungsu Kim
28
0
0
27 May 2025
Locality-Aware Zero-Shot Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
VLM
87
0
0
26 May 2025
What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin
Yue Hu
Jiangtao Shen
Yunhang Shen
Liujuan Cao
Shengchuan Zhang
Chia-Wen Lin
ObjD
VLM
49
0
0
26 May 2025
Zero-Shot Pseudo Labels Generation Using SAM and CLIP for Semi-Supervised Semantic Segmentation
Nagito Saito
Shintaro Ito
Koichi Ito
T. Aoki
VLM
MedIm
28
0
0
26 May 2025
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi
Davide Bucciarelli
Federico Betti
Marcella Cornia
Lorenzo Baraldi
N. Sebe
Rita Cucchiara
35
0
0
26 May 2025
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model
Alaa Dalaq
Muzammil Behzad
VLM
20
0
0
25 May 2025
DocMMIR: A Framework for Document Multi-modal Information Retrieval
Zirui Li
Siwei Wu
Xingyu Wang
Yi Zhou
Yizhi Li
Chenghua Lin
VLM
7
0
0
25 May 2025
Syn3DTxt: Embedding 3D Cues for Scene Text Generation
Li-Syun Hsiung
Jun-Kai Tu
Kuan-Wu Chu
Yu-Hsuan Chiu
Yan-Tsung Peng
Sheng-Luen Chung
Gee-Sern Jison Hsu
10
0
0
24 May 2025
Reasoning Segmentation for Images and Videos: A Survey
Yiqing Shen
Chenjia Li
Fei Xiong
Jeong-O Jeong
Tianpeng Wang
Michael Latman
Mathias Unberath
VOS
55
0
0
24 May 2025
EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models
G. MEng
Sunan He
Jinpeng Wang
Tao Dai
Letian Zhang
Jieming Zhu
Qing Li
Gang Wang
Rui Zhang
Yong Jiang
VLM
115
0
0
24 May 2025
Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation
Zhihua Liu
Amrutha Saseendran
Lei Tong
Xilin He
Fariba Yousefi
...
Dino Oglic
Tom Diethe
Philip Teare
Huiyu Zhou
Chen Jin
VLM
52
0
0
23 May 2025
Accelerating Learned Image Compression Through Modeling Neural Training Dynamics
Yichi Zhang
Zhihao Duan
Yuning Huang
Fengqing Zhu
66
0
0
23 May 2025
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Yan Ma
Linge Du
Xuyang Shen
Shaoxiang Chen
Pengfei Li
Qibing Ren
Lizhuang Ma
Yuchao Dai
Pengfei Liu
Junjie Yan
OffRL
LRM
28
0
0
23 May 2025
Towards more transferable adversarial attack in black-box manner
Chun Tong Lei
Zhongliang Guo
Hon Chung Lee
Minh Quoc Duong
Chun Pong Lau
DiffM
AAML
115
0
0
23 May 2025
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang
Chongjie Si
Jun Luo
Hanwang Zhang
Chao Ma
69
0
0
23 May 2025
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
U. Jeong
J. Freer
Seungryul Baek
H. Chang
K. Kim
3DH
42
0
0
23 May 2025
Creatively Upscaling Images with Global-Regional Priors
Yurui Qian
Qi Cai
Yingwei Pan
Ting Yao
Tao Mei
DiffM
59
0
0
22 May 2025
Panoptic Captioning: Seeking An Equivalency Bridge for Image and Text
Kun-Yu Lin
Hongjun Wang
Weining Ren
Kai Han
81
0
0
22 May 2025
Multimodal Generative AI for Story Point Estimation in Software Development
Mohammad Rubyet Islam
Peter Sandborn
45
0
0
22 May 2025
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
Hanyue Lou
Jinxiu Liang
Minggui Teng
Yi Wang
Boxin Shi
VGen
12
0
0
22 May 2025
SuperPure: Efficient Purification of Localized and Distributed Adversarial Patches via Super-Resolution GAN Models
Hossein Khalili
Seongbin Park
Venkat Bollapragada
Nader Sehatbakhsh
AAML
45
0
0
22 May 2025
Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
Jiachen Jiang
Jinxin Zhou
Bo Peng
Xia Ning
Zhihui Zhu
17
0
0
22 May 2025
One-Step Diffusion-Based Image Compression with Semantic Distillation
Naifu Xue
Zhaoyang Jia
Jiahao Li
Bin Li
Yuan Zhang
Yan Lu
DiffM
37
0
0
22 May 2025
Decoupled Geometric Parameterization and its Application in Deep Homography Estimation
Yao Huang
Si-Yuan Cao
Yaqing Ding
Hao Yin
Shibin Xie
...
Zhijun Fang
Jiachun Wang
Shen Cai
Junchi Yan
Shuhan Shen
16
0
0
22 May 2025
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
Jiquan Shan
Junxiao Wang
Lifeng Zhao
Liang Cai
Hongyuan Zhang
Ioannis Liritzis
ViT
48
0
0
22 May 2025
Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval
Siting Li
Xiang Gao
Simon Shaolei Du
27
0
0
21 May 2025
Prompt Tuning Vision Language Models with Margin Regularizer for Few-Shot Learning under Distribution Shifts
Debarshi Brahma
Anuska Roy
Soma Biswas
VLM
70
0
0
21 May 2025
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
Yuxiang Wei
Yanteng Zhang
Xi Xiao
Tianyang Wang
Xiao Wang
Vince D. Calhoun
MoE
48
0
0
21 May 2025
Domain Adaptation for Multi-label Image Classification: a Discriminator-free Approach
Inder Pal Singh
Enjie Ghorbel
Anis Kacem
Djamila Aouada
119
0
0
20 May 2025
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
Yilin Ye
Junchao Huang
Xingchen Zeng
Jiazhi Xia
Wei Zeng
68
0
0
20 May 2025
Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation
Bin-Bin Gao
Xiaochen Chen
Z. Huang
Congchong Nie
Jun Liu
Jinxiang Lai
Guannan Jiang
Xi-Zhao Wang
Chengjie Wang
60
28
0
20 May 2025
Dynadiff: Single-stage Decoding of Images from Continuously Evolving fMRI
Marlène Careil
Yohann Benchetrit
Jean-Rémi King
46
0
0
20 May 2025
Handloom Design Generation Using Generative Networks
Rajat Kanti Bhattacharjee
Meghali Nandi
Amrit Jha
Gunajit Kalita
Ferdous Ahmed Barbhuiya
AI4CE
22
4
0
20 May 2025
Rethinking Features-Fused-Pyramid-Neck for Object Detection
Hulin Li
67
0
0
19 May 2025
Advancing Sequential Numerical Prediction in Autoregressive Models
Xiang Fei
Jinghui Lu
Qi Sun
Hao Feng
Yanjie Wang
Wei Shi
An-Lan Wang
Jingqun Tang
Can Huang
AI4TS
52
3
0
19 May 2025
CompBench: Benchmarking Complex Instruction-guided Image Editing
Bohan Jia
Wenxuan Huang
Yuntian Tang
Junbo Qiao
Jincheng Liao
...
Lin Chen
Fei Zhao
Zihan Wang
Yuan Xie
Shaohui Lin
CoGe
56
1
0
18 May 2025
AoP-SAM: Automation of Prompts for Efficient Segmentation
Yi Chen
Mu-Young Son
Chuanbo Hua
Joo-Young Kim
VLM
37
0
0
17 May 2025
SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds
Ranit Karmakar
Simon F. Nørrelykke
12
0
0
17 May 2025
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning
Yuqi Liu
Tianyuan Qu
Zhisheng Zhong
Bohao Peng
Shu Liu
Bei Yu
Jiaya Jia
VLM
LRM
64
2
0
17 May 2025
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Haipeng Fang
Sheng Tang
Juan Cao
Enshuo Zhang
Fan Tang
Tong-Yee Lee
31
0
0
16 May 2025
Object-Centric Representations Improve Policy Generalization in Robot Manipulation
Alexandre Chapin
Bruno Machado
Emmanuel Dellandrea
Liming Chen
OCL
54
0
0
16 May 2025
TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs
Pengju Xu
Yan Wang
Shuyuan Zhang
Xuan Zhou
Xin Li
...
Fengzhao Li
Shuigeng Zhou
Xingyu Wang
Yi Zhang
Haiying Zhao
VLM
47
0
0
16 May 2025
MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
Mugilan Ganesan
Siyang Song
Ankur Aggarwal
Nish Sinnadurai
Sean Lie
Vithursan Thangarasa
VLM
45
0
0
15 May 2025
Boosting Text-to-Chart Retrieval through Training with Synthesized Semantic Insights
Yifan Wu
Lutao Yan
Yizhang Zhu
Yinan Mei
Jiannan Wang
Nan Tang
Yuyu Luo
44
1
0
15 May 2025
ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
Wenhao Shen
Wanqi Yin
Xiaofeng Yang
Cheng Chen
Chaoyue Song
Zhongang Cai
Lei Yang
Hao Wang
Guosheng Lin
71
0
0
15 May 2025
MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning
Bin-Bin Gao
VLM
50
0
0
14 May 2025
SPAST: Arbitrary Style Transfer with Style Priors via Pre-trained Large-scale Model
Zhanjie Zhang
Quanwei Zhang
Junsheng Luan
Mengyuan Yang
Yun Wang
Lei Zhao
57
1
0
13 May 2025
HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation
Hang Wang
Zhi-Qi Cheng
Chenhao Lin
Chao Shen
Lei Zhang
DiffM
46
0
0
10 May 2025