MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors

ACM Multimedia (MM), 2024
2 May 2024
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Yixue Hao
Long Hu
Min Chen

Papers citing "MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors"

20 / 20 papers shown
ArtiWorld: LLM-Driven Articulation of 3D Objects in Scenes
Yixuan Yang
Luyang Xie
Zhen Luo
Zixiang Zhao
Tongsheng Ding
Mingqi Gao
Feng Zheng
257
1
0
17 Nov 2025
AutoHood3D: A Multi-Modal Benchmark for Automotive Hood Design and Fluid-Structure Interaction
Vansh Sharma
Harish Jai Ganesh
Maryam Akram
Wanjiao Liu
Venkat Raman
AI4CE
133
0
0
05 Nov 2025
3D Aware Region Prompted Vision Language Model
A. Cheng
Yang Fu
Yukang Chen
Zhijian Liu
X. Li
...
Jan Kautz
Pavlo Molchanov
Hongxu Yin
Xiaolong Wang
Sifei Liu
169
20
0
16 Sep 2025
Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model
Zhuoxu Huang
Mingqi Gao
Jungong Han
199
2
0
09 Sep 2025
TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints
Vinh-Thuan Ly
Hoang M. Truong
Xuan-Huong Nguyen
LRM
102
0
0
25 Aug 2025
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
Ting Huang
Zeyu Zhang
Hao Tang
LRM
280
23
0
31 Jul 2025
BANG: Dividing 3D Assets via Generative Exploded Dynamics
ACM Transactions on Graphics (TOG), 2025
Longwen Zhang
Qixuan Zhang
Haoran Jiang
Yinuo Bai
Wei Yang
Lan Xu
Jingyi Yu
251
21
0
29 Jul 2025
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
Xingyu Miao
Haoran Duan
Quanhao Qian
Jiuniu Wang
Yang Long
Ling Shao
Deli Zhao
Ran Xu
Gongjie Zhang
312
6
0
24 Jul 2025
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs
Haoyuan Li
Yanpeng Zhou
Yufei Gao
Tao Tang
J. N. Han
Yujie Yuan
Dave Zhenyu Chen
Jiawang Bian
Hang Xu
Xiaodan Liang
429
7
0
05 Jun 2025
Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
Computer Vision and Pattern Recognition (CVPR), 2025
Tomoya Yoshida
Shuhei Kurita
Taichi Nishimura
Shinsuke Mori
394
4
0
04 Jun 2025
OneLLM: One Framework to Align All Modalities with Language
Computer Vision and Pattern Recognition (CVPR), 2023
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Yuan Liu
Kaipeng Zhang
Dahua Lin
Yu Qiao
Shiyang Feng
Xiangyu Yue
MLLM
711
230
0
10 Jan 2025
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
Computer Vision and Pattern Recognition (CVPR), 2024
Hongyan Zhi
Peihao Chen
Junyan Li
Shuailei Ma
Xinyu Sun
Tianhang Xiang
Yinjie Lei
Mingkui Tan
Chuang Gan
531
36
0
02 Dec 2024
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop
Zhaofang Qian
Abolfazl Sharifi
Tucker Carroll
Ser-Nam Lim
VGen
351
0
0
26 Nov 2024
MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
Computer Vision and Pattern Recognition (CVPR), 2024
Feifei Shao
Ping Liu
Zhao Wang
Yawei Luo
Hongwei Wang
Jun Xiao
3DPC
372
4
0
25 Nov 2024
More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Jinfeng Xu
Yixue Hao
Long Hu
Min Chen
586
6
0
28 Aug 2024
Foundation Models for Autonomous Robots in Unstructured Environments
Hossein Naderi
Alireza Shojaei
Lifu Huang
LM&Ro
430
8
0
19 Jul 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
751
7
0
17 Jun 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
434
36
0
09 Jun 2024
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
Kuan-Chih Huang
Xiangtai Li
Lu Qi
Shuicheng Yan
Ming-Hsuan Yang
LRM
476
25
0
27 May 2024
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM, LRM
1.0K
966
0
19 Sep 2023