2412.02611
Cited By
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
3 December 2024
Kaixiong Gong, Kaituo Feng, B. Li, Yibing Wang, Mofan Cheng, Shijia Yang, Jiaming Han, Benyou Wang, Yutong Bai, Z. Yang, Xiangyu Yue
MLLM
AuLLM
VLM
Papers citing
"AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"
4 / 4 papers shown
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed Elhoseiny, Salman Khan, Dinesh Manocha
LRM
36
0
0
29 Mar 2025
Qwen2.5-Omni Technical Report
Jin Xu, Zhifang Guo, Jinzheng He, Hangrui Hu, Ting He, ..., K. Dang, Bin Zhang, X. Wang, Yunfei Chu, Junyang Lin
VGen
AuLLM
86
12
0
26 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang, Shengqiong Wu, Y. Zhang, William Yang Wang, Ziwei Liu, Jiebo Luo, Hao Fei
LRM
80
7
0
16 Mar 2025
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information
Feng Jiang, Zhiyu Lin, Fan Bu, Yuhao Du, Benyou Wang, H. Li
AuLLM
ELM
88
0
0
07 Mar 2025