ResearchTrend.AI

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning

6 September 2023
Sijin Chen
Hongyuan Zhu
Mingsheng Li
Xin Chen
Peng Guo
Yinjie Lei
Gang Yu
Taihao Li
Tao Chen

Papers citing "Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning"

9 / 9 papers shown
SpatialPrompting: Keyframe-driven Zero-Shot Spatial Reasoning with Off-the-Shelf Multimodal Large Language Models
Shun Taguchi
Hideki Deguchi
Takumi Hamazaki
Hiroyuki Sakai
ReLM
LRM
49
0
0
08 May 2025
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
Weitai Kang
Haifeng Huang
Yuzhang Shang
Mubarak Shah
Yan Yan
46
7
0
21 Feb 2025
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Venkata Naren Devarakonda
Raktim Gautam Goswami
Ali Umut Kaypak
Naman Patel
Rooholla Khorrambakht
Prashanth Krishnamurthy
Farshad Khorrami
LM&Ro
39
3
0
08 Oct 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Hongyuan Zhu
Jiayuan Fan
Tao Chen
MLLM
26
79
0
30 Nov 2023
Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR
Feng Li
Ailing Zeng
Siyi Liu
Hao Zhang
Hongyang Li
Lei Zhang
L. Ni
ViT
36
67
0
13 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
272
4,244
0
30 Jan 2023
Contextual Modeling for 3D Dense Captioning on Point Clouds
Yufeng Zhong
Longdao Xu
Jiebo Luo
Lin Ma
44
15
0
08 Oct 2022
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
53
63
0
29 Sep 2022
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
Adam Paszke
Abhishek Chaurasia
Sangpil Kim
Eugenio Culurciello
SSeg
233
2,056
0
07 Jun 2016