2506.05328
Cited By
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs
5 June 2025
Lidong Lu
Guo Chen
Ruoyao Xiao
Yicheng Liu
Tong Lu
VLM
LRM
HuggingFace (20 upvotes)
Papers citing "AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs"
6 / 6 papers shown
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
Baoqi Pei
Yifei Huang
Jilan Xu
Yuping He
Guo Chen
Fei Wu
Yu Qiao
Jiangmiao Pang
EgoV
LRM
215
4
0
27 Oct 2025
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
Xingrui Wang
Jiang Liu
Chao Huang
X. Yu
Ze Wang
Ximeng Sun
Jialian Wu
Alan Yuille
Emad Barsoum
Zicheng Liu
VLM
101
0
0
16 Oct 2025
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
Yunlong Tang
Jing Bi
Pinxin Liu
Zhenyu Pan
Mingqian Feng
...
Zeliang Zhang
Daiki Shimada
Han Liu
Jiebo Luo
Chenliang Xu
MLLM
OffRL
VLM
LRM
744
8
0
06 Oct 2025
ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis
Congzhi Zhang
Zhibin Wang
Yinchao Ma
Jiawei Peng
Y. Wang
Qiang Zhou
Jun Song
Bo Zheng
OffRL
AI4TS
LRM
230
2
0
28 Sep 2025
AVATAR: Reinforcement Learning to See, Hear, and Reason Over Video
Yogesh Kulkarni
Pooyan Fazli
OffRL
LRM
284
4
0
05 Aug 2025
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
Haodong Duan
Xinyu Fang
Junming Yang
Xiangyu Zhao
Lin Chen
...
Yuhang Zang
Pan Zhang
Jiaqi Wang
Dahua Lin
Kai Chen
LM&MA
VLM
725
358
0
16 Jul 2024