Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.00476
Cited By
TempCompass: Do Video LLMs Really Understand Videos?
1 March 2024
Yuanxin Liu
Shicheng Li
Yi Liu
Yuxiang Wang
Shuhuai Ren
Lei Li
Sishuo Chen
Xu Sun
Lu Hou
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"TempCompass: Do Video LLMs Really Understand Videos?"
32 / 82 papers shown
Title
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Yunlong Tang
Junjia Guo
Hang Hua
Susan Liang
Mingqian Feng
...
Chao Huang
Jing Bi
Zeliang Zhang
Pooyan Fazli
Chenliang Xu
CoGe
74
8
0
17 Nov 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination
Haojie Zheng
Tianyang Xu
Hanchi Sun
Shu Pu
Ruoxi Chen
Lichao Sun
MLLM
LRM
64
8
0
15 Nov 2024
VCBench: A Controllable Benchmark for Symbolic and Abstract Challenges in Video Cognition
Chenglin Li
Qianglong Chen
Zhi Li
Feng Tao
Yin Zhang
29
0
0
14 Nov 2024
HourVideo: 1-Hour Video-Language Understanding
Keshigeyan Chandrasegaran
Agrim Gupta
Lea M. Hadzic
Taran Kota
Jimming He
Cristobal Eyzaguirre
Zane Durante
Manling Li
Jiajun Wu
L. Fei-Fei
VLM
33
31
0
07 Nov 2024
When Does Perceptual Alignment Benefit Vision Representations?
Shobhita Sundaram
Stephanie Fu
Lukas Muttenthaler
Netanel Y. Tamir
Lucy Chai
Simon Kornblith
Trevor Darrell
Phillip Isola
47
12
1
14 Oct 2024
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
L. Chen
Yazheng Yang
Benyou Wang
Lingpeng Kong
Q. Liu
VLM
ALM
34
16
0
12 Oct 2024
G
2
^{2}
2
TR: Generalized Grounded Temporal Reasoning for Robot Instruction Following by Combining Large Pre-trained Models
Riya Arora
N. N.
Aman Tambi
Sandeep S. Zachariah
Souvik Chakraborty
Rohan Paul
LM&Ro
26
0
0
10 Oct 2024
Temporal Reasoning Transfer from Text to Video
Lei Li
Yuanxin Liu
Linli Yao
Peiyuan Zhang
Chenxin An
Lean Wang
Xu Sun
Lingpeng Kong
Qi Liu
LRM
30
6
0
08 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
M. Lyu
Liwei Wang
VLM
26
0
0
08 Oct 2024
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Haibo Wang
Zhiyang Xu
Yu Cheng
Shizhe Diao
Yufan Zhou
Yixin Cao
Qifan Wang
Weifeng Ge
Lifu Huang
19
20
0
04 Oct 2024
Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos
Jianrui Zhang
Mu Cai
Yong Jae Lee
26
6
0
03 Oct 2024
Video Instruction Tuning With Synthetic Data
Yuanhan Zhang
Jinming Wu
Wei Li
Bo Li
Zejun Ma
Ziwei Liu
Chunyuan Li
SyDa
VGen
39
136
0
03 Oct 2024
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding
Ye Liu
Zongyang Ma
Zhongang Qi
Yang Wu
Ying Shan
Chang Wen Chen
26
15
0
26 Sep 2024
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Zhiyu Tan
Hao Li
Jingjing Chen
MLLM
59
17
0
25 Sep 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
34
8
0
21 Sep 2024
Multimodal Datasets and Benchmarks for Reasoning about Dynamic Spatio-Temporality in Everyday Environments
Takanori Ugai
Kensho Hara
S. Egami
Ken Fukuda
VGen
18
0
0
21 Aug 2024
VideoQA in the Era of LLMs: An Empirical Study
Junbin Xiao
Nanxin Huang
Hangyu Qin
Dongyang Li
Yicong Li
...
Zhulin Tao
Jianxing Yu
Liang Lin
Tat-Seng Chua
Angela Yao
23
10
0
08 Aug 2024
MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline
D. Han
Eunhwan Park
Gisang Lee
Adam Lee
Nojun Kwak
32
2
0
17 Jul 2024
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
Orr Zohar
Xiaohan Wang
Yonatan Bitton
Idan Szpektor
Serena Yeung-Levy
VLM
LRM
48
8
0
08 Jul 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Rui Qian
...
Kai Chen
Jifeng Dai
Yu Qiao
Dahua Lin
Jiaqi Wang
43
98
0
03 Jul 2024
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
S.
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM
ELM
LM&MA
85
17
0
23 Jun 2024
Towards Event-oriented Long Video Understanding
Yifan Du
Kun Zhou
Yuqi Huo
Yifan Li
Wayne Xin Zhao
Haoyu Lu
Zijia Zhao
Bingning Wang
Weipeng Chen
Ji-Rong Wen
VLM
19
13
0
20 Jun 2024
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning
Yunxin Li
Xinyu Chen
Baotian Hu
Longyue Wang
Haoyuan Shi
Min-Ling Zhang
MLLM
LRM
38
25
0
17 Jun 2024
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs
Zijia Zhao
Haoyu Lu
Yuqi Huo
Yifan Du
Tongtian Yue
Longteng Guo
Bingning Wang
Weipeng Chen
Jing Liu
28
2
0
13 Jun 2024
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Lin Chen
Xilin Wei
Jinsong Li
Xiaoyi Dong
Pan Zhang
...
Li Yuan
Yu Qiao
Dahua Lin
Feng Zhao
Jiaqi Wang
69
138
0
06 Jun 2024
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Chaoyou Fu
Yuhan Dai
Yondong Luo
Lei Li
Shuhuai Ren
...
Tong Bill Xu
Xiawu Zheng
Enhong Chen
Rongrong Ji
Xing Sun
VLM
MLLM
41
216
0
31 May 2024
CityLLaVA: Efficient Fine-Tuning for VLMs in City Scenario
Zhizhao Duan
Hao Cheng
Duo Xu
Xi Wu
Xiangxie Zhang
Xi Ye
Zhen Xie
24
6
0
06 May 2024
Pegasus-v1 Technical Report
Raehyuk Jung
Hyojun Go
Jaehyuk Yi
Jiho Jang
Daniel Kim
...
Maninder Saini
Meredith Sanders
Soyoung Lee
Sue Kim
Travis Couture
MLLM
VLM
26
5
0
23 Apr 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
47
76
0
29 Dec 2023
VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Shicheng Li
Lei Li
Shuhuai Ren
Yuanxin Liu
Yi Liu
Rundong Gao
Xu Sun
Lu Hou
24
28
0
29 Nov 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
185
576
0
16 Nov 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
203
883
0
27 Apr 2023
Previous
1
2