ResearchTrend.AI

HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning

23 May 2025
Chuhao Zhou
Jianfei Yang
    VLM
arXiv: 2505.17645

Papers cited by "HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning"

30 / 30 papers shown
Title
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Jiaqi Wang
Kaipeng Zhang
Dahua Lin
Yu Qiao
Peng Gao
Xiangyu Yue
MLLM
140
121
0
10 Jan 2025
X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
Xinyan Chen
Jianfei Yang
67
2
0
14 Oct 2024
Video Instruction Tuning With Synthetic Data
Yuanhan Zhang
Jinming Wu
Wei Li
Bo Li
Zejun Ma
Ziwei Liu
Chunyuan Li
SyDa
VGen
89
164
0
03 Oct 2024
TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li
Yuqian Yuan
Jian Liu
Dongqi Tang
Song Wang
Jie Qin
Jianke Zhu
Lei Zhang
MLLM
55
57
0
02 Jul 2024
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
138
425
0
13 Jun 2024
LLMs are Good Action Recognizers
Haoxuan Qu
Yujun Cai
Jun Liu
63
17
0
31 Mar 2024
LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment
Yiming Ren
Xiao Han
Chengfeng Zhao
Jingya Wang
Lan Xu
Jingyi Yu
Yuexin Ma
3DH
51
14
0
27 Feb 2024
Honeybee: Locality-enhanced Projector for Multimodal LLM
Junbum Cha
Wooyoung Kang
Jonghwan Mun
Byungseok Roh
MLLM
53
124
0
11 Dec 2023
TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition
Yunjiao Zhou
Jianfei Yang
Han Zou
Lihua Xie
VLM
49
20
0
14 Nov 2023
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu
Xiaolong Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
Dahua Lin
MLLM
66
165
0
31 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
206
11,636
0
18 Jul 2023
Scalable 3D Captioning with Pretrained Models
Tiange Luo
C. Rockwell
Honglak Lee
Justin Johnson
42
156
0
12 Jun 2023
ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst
Zijia Zhao
Longteng Guo
Tongtian Yue
Si-Qing Chen
Shuai Shao
Xinxin Zhu
Zehuan Yuan
Jing Liu
MLLM
63
57
0
25 May 2023
MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing
Jianfei Yang
He Huang
Yunjiao Zhou
Xinyan Chen
Yuecong Xu
Shenghai Yuan
Han Zou
Chris Xiaoxuan Lu
Lihua Xie
59
49
0
12 May 2023
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar
Alaaeldin El-Nouby
Zhuang Liu
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
VLM
87
889
0
09 May 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
343
4,607
0
17 Apr 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
631
13,788
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
385
4,465
0
30 Jan 2023
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
125
3,355
0
16 Oct 2022
ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions
Anjun Chen
Xiangyu Wang
Kun Shi
Shaohao Zhu
Bin Fang
Yingke Chen
Jiming Chen
Yuchi Huo
Qi Ye
3DH
62
21
0
04 Oct 2022
SenseFi: A Library and Benchmark on Deep-Learning-Empowered WiFi Human Sensing
Jianfei Yang
Xinyan Chen
Dazhuo Wang
Han Zou
Chris Xiaoxuan Lu
S. Sun
Lihua Xie
3DV
38
80
0
16 Jul 2022
HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling
Zhongang Cai
Daxuan Ren
Ailing Zeng
Zhengyu Lin
Tao Yu
...
Fangzhou Hong
Mingyuan Zhang
Chen Change Loy
Lei Yang
Ziwei Liu
3DH
73
103
0
28 Apr 2022
EfficientFi: Towards Large-Scale Lightweight WiFi Sensing via CSI Compression
Jianfei Yang
Xinyan Chen
Han Zou
Dazhuo Wang
Q. Xu
Lihua Xie
38
82
0
08 Apr 2022
LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds
Jialian Li
Jingyi Zhang
Zhiyong Wang
Siqi Shen
Chenglu Wen
Yuexin Ma
Lan Xu
Jingyi Yu
Cheng-i Wang
3DPC
52
32
0
28 Mar 2022
Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving
Jingxiao Zheng
X. Shi
Alexander N. Gorban
Junhua Mao
Yang Song
...
Visesh Chari
Andre Cornman
Yin Zhou
Congcong Li
Drago Anguelov
3DH
41
46
0
22 Dec 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
350
1,056
0
13 Oct 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
117
1,154
0
01 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
681
28,659
0
26 Feb 2021
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas Guibas
3DH
3DPC
3DV
PINN
401
14,191
0
02 Dec 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.4K
192,638
0
10 Dec 2015