Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2410.11538
Cited By
MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark
15 October 2024
Bin Shan
Xiang Fei
Wei Shi
An-Lan Wang
Guozhi Tang
Lei Liao
Jingqun Tang
Xiang Bai
Can Huang
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark"
6 / 6 papers shown
Title
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild?
An-Lan Wang
Jingqun Tang
Liao Lei
Hao Feng
Qi Liu
...
Wen Liu
Hao Liu
Wenshu Fan
Xiang Bai
Can Huang
359
3
0
16 May 2025
TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation
Yuheng Feng
Jianhui Wang
Kun Li
Sida Li
Tianyu Shi
Haoyue Han
Miao Zhang
Xueqian Wang
DiffM
999
0
0
22 Mar 2025
Cross-Modal Synergies: Unveiling the Potential of Motion-Aware Fusion Networks in Handling Dynamic and Static ReID Scenarios
Fuxi Ling
Hongye Liu
Guoqiang Huang
Jing Li
Hong Wu
Zhihao Tang
355
0
0
02 Feb 2025
ParGo: Bridging Vision-Language with Partial and Global Views
AAAI Conference on Artificial Intelligence (AAAI), 2024
An-Lan Wang
Bin Shan
Wei Shi
Kun-Yu Lin
Xiang Fei
Guozhi Tang
Lei Liao
Jingqun Tang
Can Huang
Wei-Shi Zheng
MLLM
VLM
427
21
0
23 Aug 2024
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
Jingqun Tang
Qi-dong Liu
Yongjie Ye
Jinghui Lu
Shubo Wei
...
Hao Liu
Xiang Bai
Can Huang
Xiang Bai
Can Huang
654
46
0
20 May 2024
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
Jingqun Tang
Chunhui Lin
Zhen Zhao
Shubo Wei
Binghong Wu
...
Yuliang Liu
Xiang Bai
Can Huang
Xiang Bai
Can Huang
LRM
VLM
MLLM
380
41
0
19 Apr 2024
1