ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.15116
  4. Cited By
Large Multimodal Agents: A Survey

Large Multimodal Agents: A Survey

23 February 2024
Junlin Xie
Zhihong Chen
Ruifei Zhang
Xiang Wan
Guanbin Li
    LM&Ro
    LLMAG
ArXivPDFHTML

Papers citing "Large Multimodal Agents: A Survey"

11 / 11 papers shown
Title
DriveAgent: Multi-Agent Structured Reasoning with LLM and Multimodal Sensor Fusion for Autonomous Driving
DriveAgent: Multi-Agent Structured Reasoning with LLM and Multimodal Sensor Fusion for Autonomous Driving
Xinmeng Hou
Wuqi Wang
Long Yang
Hao Lin
Jinglun Feng
Haigen Min
Xiangmo Zhao
21
0
0
04 May 2025
Enhancing Collective Intelligence in Large Language Models Through Emotional Integration
Likith Kadiyala
Ramteja Sajja
Y. Sermet
Ibrahim Demir
40
0
0
05 Mar 2025
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Runwei Guan
Ruixiao Zhang
Ningwei Ouyang
Jianan Liu
Ka Lok Man
...
Ming Xu
Jeremy S. Smith
Eng Gee Lim
Yutao Yue
Hui Xiong
38
8
0
21 May 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
107
29
0
12 Feb 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
Hui Xiong
AI4CE
77
14
0
30 Jan 2024
WebWISE: Web Interface Control and Sequential Exploration with Large
  Language Models
WebWISE: Web Interface Control and Sequential Exploration with Large Language Models
Heyi Tao
TV Sethuraman
Michal Shlapentokh-Rothman
Derek Hoiem
LLMAG
37
4
0
24 Oct 2023
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative
  Editing
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing
Yixiao Zhang
Akira Maezawa
Gus Xia
Kazuhiko Yamamoto
Simon Dixon
39
15
0
19 Oct 2023
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large
  Language Model Guidance
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Jesse Zhang
Jiahui Zhang
Karl Pertsch
Ziyi Liu
Xiang Ren
Minsuk Chang
Shao-Hua Sun
Joseph J. Lim
LLMAG
LM&Ro
57
31
0
16 Oct 2023
ChatVideo: A Tracklet-centric Multimodal and Versatile Video
  Understanding System
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System
Junke Wang
Dongdong Chen
Chong Luo
Xiyang Dai
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
87
54
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
1