Microsoft COCO: Common Objects in Context

1 May 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
    ObjD

Papers citing "Microsoft COCO: Common Objects in Context"

50 / 652 papers shown
UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model
Timo Kaiser
Thomas Norrenbrock
Bodo Rosenhahn
65
0
0
08 May 2025
MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
Hongyang Zhu
Haipeng Liu
Bo Fu
Yang Wang
DiffM
68
0
0
08 May 2025
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
89
0
0
08 May 2025
A Weak Supervision Learning Approach Towards an Equitable Mobility Estimation
Theophilus Aidoo
Till Koebe
Akansh Maurya
Hewan Shrestha
Ingmar Weber
93
0
0
07 May 2025
CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation
Viacheslav Vasilev
V. Arkhipkin
Julia Agafonova
Tatiana Nikulina
Evelina Mironova
Alisa Shichanina
Nikolai Gerasimenko
Mikhail Shoytov
Denis Dimitrov
65
0
0
07 May 2025
LiftFeat: 3D Geometry-Aware Local Feature Matching
Yepeng Liu
Wenpeng Lai
Zhou Zhao
Yuxuan Xiong
Jinchi Zhu
Jun Cheng
Yongchao Xu
73
0
0
06 May 2025
RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet
Eliraz Orfaig
Inna Stainvas
Igal Bilik
41
0
0
05 May 2025
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations
Dmitriy Shopkhoev
Ammar Ali
Magauiya Zhussip
Valentin Malykh
Stamatios Lefkimmiatis
N. Komodakis
Sergey Zagoruyko
VLM
324
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
127
0
0
05 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
71
0
0
01 May 2025
MolMole: Molecule Mining from Scientific Literature
LG AI Research
S. Chun
Jiye Kim
Ahra Jo
Yeonsik Jo
...
Sehui Han
Jaewan Lee
Changyoung Park
Kijeong Jeon
Sihyuk Yi
29
0
0
30 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
91
2
0
24 Apr 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Zehao Wang
Senthil Purushwalkam
Caiming Xiong
Siyang Song
Chenhui Xu
Ran Xu
90
2
0
23 Apr 2025
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding
Jingchao Wang
Hong Wang
Wenlong Zhang
Kunhua Ji
Dingjiang Huang
Yefeng Zheng
ObjD
65
0
0
22 Apr 2025
DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining
Wei Zhuo
Zhiyue Tang
Wufeng Xue
Hao Ding
Linlin Shen
45
0
0
22 Apr 2025
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation
Ziqiao Ma
Jing Ding
Xuejun Zhang
Dezhi Luo
Jiahe Ding
Sihan Xu
Yuchen Huang
Run Peng
Joyce Chai
105
0
0
22 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
106
1
0
18 Apr 2025
LimitNet: Progressive, Content-Aware Image Offloading for Extremely Weak Devices & Networks
A. Hojjat
Janek Haberer
Tayyaba Zainab
Olaf Landsiedel
52
3
0
18 Apr 2025
Compile Scene Graphs with Reinforcement Learning
Zuyao Chen
Jinlin Wu
Zhen Lei
Marc Pollefeys
Chang Wen Chen
OffRL
LRM
71
2
0
18 Apr 2025
Mask Image Watermarking
Runyi Hu
Jie Zhang
Shiqian Zhao
Nils Lukas
Jiwei Li
Qing Guo
Han Qiu
Tianwei Zhang
59
0
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
150
5
0
17 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
104
0
0
15 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Yansen Wang
Huiyu Xu
Peng Kuang
Jiacheng Du
Zehan Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
90
0
0
15 Apr 2025
WaterFlow: Learning Fast & Robust Watermarks using Stable Diffusion
Vinay Shukla
Prachee Sharma
Ryan Rossi
Sungchul Kim
Tong Yu
Aditya Grover
WIGM
101
0
0
15 Apr 2025
The Mirage of Performance Gains: Why Contrastive Decoding Fails to Address Multimodal Hallucination
Hao Yin
Guangzong Si
Zilei Wang
351
0
0
14 Apr 2025
Probability Distribution Alignment and Low-Rank Weight Decomposition for Source-Free Domain Adaptive Brain Decoding
Ganxi Xu
Jinyi Long
Hanrui Wu
73
0
0
12 Apr 2025
Evolved Hierarchical Masking for Self-Supervised Learning
Zhanzhou Feng
Shiliang Zhang
75
0
0
12 Apr 2025
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
89
0
0
11 Apr 2025
Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks
Erin Carson
Xinye Chen
80
0
0
10 Apr 2025
AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes
Zhenteng Li
Sheng Lian
Dengfeng Pan
Yijiao Wang
Wei Liu
86
0
0
08 Apr 2025
PromptHMR: Promptable Human Mesh Recovery
Yufu Wang
Yu Sun
Priyanka Patel
Kostas Daniilidis
Michael J. Black
Muhammed Kocabas
3DH
103
0
0
08 Apr 2025
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
Rupayan Mallick
Sibo Dong
Nataniel Ruiz
Sarah Adel Bargal
DiffM
101
0
0
08 Apr 2025
EffOWT: Transfer Visual Language Models to Open-World Tracking Efficiently and Effectively
Bingyang Wang
Kaer Huang
Bin Li
Yiqiang Yan
Lulu Zhang
Huchuan Lu
You He
VLM
60
0
0
07 Apr 2025
A High-Force Gripper with Embedded Multimodal Sensing for Powerful and Perception Driven Grasping
Edoardo Del Bianco
Davide Torielli
Federico Rollo
Damiano Gasperini
Arturo Laurenzi
Lorenzo Baccelliere
L. Muratore
Marco Roveri
Nikos Tsagarakis
41
2
0
07 Apr 2025
Directional Sign Loss: A Topology-Preserving Loss Function that Approximates the Sign of Finite Differences
Harvey Dam
Tripti Agarwal
Ganesh Gopalakrishnan
53
0
0
05 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
155
1
0
03 Apr 2025
Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results
Andrei Dumitriu
Florin Tatui
Florin Miron
Radu Tudor Ionescu
Radu Timofte
98
23
0
03 Apr 2025
BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation
Van Nguyen Nguyen
Stephen Tyree
Andrew Guo
Mederic Fourmy
Anas Gouda
...
Stan Birchfield
Jiri Matas
Yann Labbé
M. Sundermeyer
Tomás Hodan
3DPC
67
1
0
03 Apr 2025
UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao
Yiyang Gan
Bairui Wang
Jie Qin
Shuang Xu
Siqi Yang
Lin Ma
80
0
0
02 Apr 2025
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety
Andrei Dumitriu
Florin Tatui
Florin Miron
Aakash Ralhan
Radu Tudor Ionescu
Radu Timofte
60
0
0
01 Apr 2025
CBIL: Collective Behavior Imitation Learning for Fish from Real Videos
Yifan Wu
Zhiyang Dou
Yuko Ishiwaka
Shun Ogawa
Yuke Lou
Wenping Wang
Lingjie Liu
Taku Komura
114
3
0
31 Mar 2025
A GAN-Enhanced Deep Learning Framework for Rooftop Detection from Historical Aerial Imagery
Pengyu Chen
Sicheng Wang
Cuizhen Wang
Senrong Wang
Beiao Huang
Lu Huang
Zhe Zang
54
0
0
29 Mar 2025
Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization
Iñigo Pikabea
Iñaki Lacunza
Oriol Pareras
Carlos Escolano
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
VLM
105
0
0
28 Mar 2025
VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD Detection
Bin Zhang
Xiaoyang Qu
Guokuan Li
Jiguang Wan
Jianzong Wang
VLM
68
0
0
28 Mar 2025
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Yunhong Min
Daehyeon Choi
Kyeongmin Yeo
Jihyun Lee
Minhyuk Sung
72
0
0
28 Mar 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL
VLM
128
3
0
27 Mar 2025
vGamba: Attentive State Space Bottleneck for efficient Long-range Dependencies in Visual Recognition
Yunusa Haruna
A. Lawan
Mamba
73
0
0
27 Mar 2025
Pluggable Style Representation Learning for Multi-Style Transfer
Hongda Liu
Longguang Wang
Weijun Guan
Ye Zhang
Yulan Guo
100
1
0
26 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
102
0
0
26 Mar 2025
Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval
Haoqiang Lin
Haokun Wen
Xuemeng Song
Meng Liu
Yupeng Hu
Liqiang Nie
99
15
0
25 Mar 2025