2309.05519
NExT-GPT: Any-to-Any Multimodal LLM
11 September 2023
Shengqiong Wu
Hao Fei
Leigang Qu
Wei Ji
Tat-Seng Chua
MLLM
Papers citing "NExT-GPT: Any-to-Any Multimodal LLM"
50 / 336 papers shown
Title
Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models
Kelvin J.L. Koa
Yunshan Ma
Ritchie Ng
Tat-Seng Chua
AIFin
LLMAG
31
25
0
06 Feb 2024
User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT
Anna Bodonhelyi
Efe Bozkir
Shuo Yang
Enkelejda Kasneci
Gjergji Kasneci
ELM
AI4MH
23
13
0
03 Feb 2024
Bringing Generative AI to Adaptive Learning in Education
Hang Li
Tianlong Xu
Chaoli Zhang
Eason Chen
Jing Liang
Xing Fan
Haoyang Li
Jiliang Tang
Qingsong Wen
34
20
0
02 Feb 2024
In-Context Learning for Few-Shot Nested Named Entity Recognition
Meishan Zhang
Bin Wang
Hao Fei
Min Zhang
NAI
33
3
0
02 Feb 2024
SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models
Xiao Shao
Weifu Jiang
Fei Zuo
Mengqing Liu
LLMAG
23
6
0
31 Jan 2024
Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation
Yuanhuiyi Lyu
Xueye Zheng
Lin Wang
DiffM
17
9
0
31 Jan 2024
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Pengyuan Zhou
Lin Wang
Zhi Liu
Yanbin Hao
Pan Hui
Sasu Tarkoma
J. Kangasharju
VGen
26
24
0
30 Jan 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
77
14
0
30 Jan 2024
ChatGraph: Chat with Your Graphs
Yun Peng
Sen Lin
Qian Chen
Lyu Xu
Xiaojun Ren
Yafei Li
Jianliang Xu
22
1
0
23 Jan 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Eric Wang
X. Li
Luisa Verdoliva
Shu Hu
75
53
0
22 Jan 2024
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning
Chenyu Wang
Weixin Luo
Qianyu Chen
Haonan Mai
Jindi Guo
Sixun Dong
Xiaohua Xuan
MLLM
LLMAG
39
17
0
19 Jan 2024
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
Changyao Tian
Xizhou Zhu
Yuwen Xiong
Weiyun Wang
Zhe Chen
...
Tong Lu
Jie Zhou
Hongsheng Li
Yu Qiao
Jifeng Dai
AuLLM
83
40
0
18 Jan 2024
Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer
Junhao Zheng
Qianli Ma
Zhen Liu
Binquan Wu
Hu Feng
CLL
18
14
0
17 Jan 2024
Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering
Qing Li
Lei Li
Yu Li
LM&MA
AI4MH
25
6
0
15 Jan 2024
ModaVerse: Efficiently Transforming Modalities with LLMs
Xinyu Wang
Bohan Zhuang
Qi Wu
6
11
0
12 Jan 2024
GroundingGPT: Language Enhanced Multi-modal Grounding Model
Zhaowei Li
Qi Xu
Dong Zhang
Hang Song
Yiqing Cai
...
Junting Pan
Zefeng Li
Van Tu Vu
Zhida Huang
Tao Wang
18
36
0
11 Jan 2024
AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis
Kebin Wu
Wenbin Li
Xiaofei Xiao
11
2
0
05 Jan 2024
Data-Centric Foundation Models in Computational Healthcare: A Survey
Yunkun Zhang
Jin Gao
Zheling Tan
Lingfeng Zhou
Kexin Ding
Mu Zhou
Shaoting Zhang
Dequan Wang
AI4CE
21
20
0
04 Jan 2024
Ravnest: Decentralized Asynchronous Training on Heterogeneous Devices
A. Menon
Unnikrishnan Menon
Kailash Ahirwar
8
1
0
03 Jan 2024
Visual Instruction Tuning towards General-Purpose Multimodal Model: A Survey
Jiaxing Huang
Jingyi Zhang
Kai Jiang
Han Qiu
Shijian Lu
28
22
0
27 Dec 2023
Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought
Limin Zheng
Hao Fei
Fei Li
Bobo Li
Lizi Liao
Donghong Ji
Chong Teng
18
7
0
23 Dec 2023
FoodLMM: A Versatile Food Assistant using Large Multi-modal Model
Yuehao Yin
Huiyan Qi
B. Zhu
Jingjing Chen
Yu-Gang Jiang
Chong-Wah Ngo
9
17
0
22 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
144
895
0
21 Dec 2023
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh
Teo Susnjak
Tong Liu
Paul Watters
Malka N. Halgamuge
79
46
0
18 Dec 2023
Data-Efficient Multimodal Fusion on a Single GPU
Noël Vouitsis
Zhaoyan Liu
S. Gorti
Valentin Villecroze
Jesse C. Cresswell
Guangwei Yu
G. Loaiza-Ganem
M. Volkovs
32
3
0
15 Dec 2023
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation
Jinguo Zhu
Xiaohan Ding
Yixiao Ge
Yuying Ge
Sijie Zhao
Hengshuang Zhao
Xiaohua Wang
Ying Shan
ViT
VLM
11
32
0
14 Dec 2023
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Wenhai Wang
Jiangwei Xie
ChuanYang Hu
Haoming Zou
Jianan Fan
...
Lewei Lu
Xizhou Zhu
Xiaogang Wang
Yu Qiao
Jifeng Dai
34
122
0
14 Dec 2023
Assessing GPT4-V on Structured Reasoning Tasks
Mukul Singh
J. Cambronero
Sumit Gulwani
Vu Le
Gust Verbruggen
LRM
35
10
0
13 Dec 2023
Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI
Kai Huang
Boyuan Yang
Wei Gao
15
1
0
13 Dec 2023
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following
Shufan Li
Harkanwar Singh
Aditya Grover
DiffM
8
7
0
11 Dec 2023
Multimodality of AI for Education: Towards Artificial General Intelligence
Gyeong-Geon Lee
Lehong Shi
Ehsan Latif
Yizhu Gao
Arne Bewersdorff
...
Zheng Liu
Hui Wang
Gengchen Mai
Tiaming Liu
Xiaoming Zhai
16
37
0
10 Dec 2023
Towards Knowledge-driven Autonomous Driving
Xin Li
Yeqi Bai
Pinlong Cai
Licheng Wen
Daocheng Fu
...
Yikang Li
Botian Shi
Yong-Jin Liu
Liang He
Yu Qiao
32
26
0
07 Dec 2023
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning
Bolin Lai
Xiaoliang Dai
Lawrence Chen
Guan Pang
James M. Rehg
Miao Liu
31
14
0
06 Dec 2023
Towards More Unified In-context Visual Understanding
Dianmo Sheng
Dongdong Chen
Zhentao Tan
Qiankun Liu
Qi Chu
Jianmin Bao
Tao Gong
Bin Liu
Shengwei Xu
Nenghai Yu
MLLM
VLM
22
3
0
05 Dec 2023
MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation
Ling Yang
Zhanyu Wang
Zhenghao Chen
Xinyu Liang
Luping Zhou
LM&MA
MedIm
49
5
0
04 Dec 2023
StoryGPT-V: Large Language Models as Consistent Story Visualizers
Xiaoqian Shen
Mohamed Elhoseiny
VLM
85
9
0
04 Dec 2023
ChatPose: Chatting about 3D Human Pose
Yao Feng
Jing Lin
Sai Kumar Dwivedi
Yu Sun
Priyanka Patel
Michael J. Black
3DH
23
34
0
30 Nov 2023
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
Zineng Tang
Ziyi Yang
Mahmoud Khademi
Yang Liu
Chenguang Zhu
Mohit Bansal
LRM
MLLM
AuLLM
52
44
0
30 Nov 2023
MLLMs-Augmented Visual-Language Representation Learning
Yanqing Liu
Kai Wang
Wenqi Shao
Ping Luo
Yu Qiao
Mike Zheng Shou
Kaipeng Zhang
Yang You
VLM
16
11
0
30 Nov 2023
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation
Yiwei Ma
Yijun Fan
Jiayi Ji
Haowei Wang
Xiaoshuai Sun
Guannan Jiang
Annan Shu
Rongrong Ji
14
7
0
30 Nov 2023
M$^{2}$Chat: Empowering VLM for Multimodal LLM Interleaved Text-Image Generation
Xiaowei Chi
Rongyu Zhang
Zhengkai Jiang
Yijiang Liu
Ziyi Lin
...
Chaoyou Fu
Peng Gao
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
MLLM
33
1
0
29 Nov 2023
SEED-Bench-2: Benchmarking Multimodal Large Language Models
Bohao Li
Yuying Ge
Yixiao Ge
Guangzhi Wang
Rui Wang
Ruimao Zhang
Ying Shan
MLLM
VLM
17
66
0
28 Nov 2023
ViT-Lens: Towards Omni-modal Representations
Weixian Lei
Yixiao Ge
Kun Yi
Jianfeng Zhang
Difei Gao
Dylan Sun
Yuying Ge
Ying Shan
Mike Zheng Shou
21
18
0
27 Nov 2023
LLMGA: Multimodal Large Language Model based Generation Assistant
Bin Xia
Shiyin Wang
Yingfan Tao
Yitong Wang
Jiaya Jia
MLLM
17
12
0
27 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
16
25
0
24 Nov 2023
Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper
Chengyu Wang
Junbing Yan
Wei Zhang
Jun Huang
ALM
32
3
0
22 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
46
14
0
20 Nov 2023
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions
Songhao Han
Le Zhuo
Yue Liao
Si Liu
VLM
16
13
0
20 Nov 2023
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents
Zhuosheng Zhang
Yao Yao
Aston Zhang
Xiangru Tang
Xinbei Ma
...
Yiming Wang
Mark B. Gerstein
Rui Wang
Gongshen Liu
Hai Zhao
LLMAG
LM&Ro
LRM
31
51
0
20 Nov 2023
Visual AI and Linguistic Intelligence Through Steerability and Composability
David A. Noever
S. M. Noever
32
0
0
18 Nov 2023