2508.03404
Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling
5 August 2025
Xinlei Yu
Z. Chen
Yudong Zhang
Shilin Lu
Ruolin Shen
J. Zhang
Xiaobin Hu
Yanwei Fu
Shuicheng Yan
HuggingFace (3 upvotes)
Github (7★)
Papers citing
"Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling"
13 / 13 papers shown
Diffusion-Based Image Editing: An Unforeseen Adversary to Robust Invisible Watermarks
Wenkai Fu
Finn Carter
Y. Wang
Emily Davis
Bo Zhang
WIGM
433
0
0
05 Nov 2025
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
Yizhang Zhu
Liangwei Wang
Chenyu Yang
Xiaotian Lin
Boyan Li
...
Shaolei Zhang
Y. Zhang
Xuanhe Zhou
Guoliang Li
Yuyu Luo
AI4TS
189
2
0
27 Oct 2025
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
Sensen Gao
Shanshan Zhao
Xu Jiang
Lunhao Duan
Yong Xien Chng
Qing-Guo Chen
Weihua Luo
Kaifu Zhang
Jia-Wang Bian
Mingming Gong
272
1
0
17 Oct 2025
DeRainMamba: A Frequency-Aware State Space Model with Detail Enhancement for Image Deraining
IEEE Signal Processing Letters (IEEE SPL), 2025
Zhiliang Zhu
Tao Zeng
Tao Yang
Guoliang Luo
Jiyong Zeng
Mamba
228
0
0
08 Oct 2025
Diffusion-Based Image Editing for Breaking Robust Watermarks
Yunyi Ni
Finn Carter
Ze Niu
Emily Davis
Bo Zhang
WIGM
462
1
0
07 Oct 2025
DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing
Zihan Zhou
Shilin Lu
Shuli Leng
Shaocong Zhang
Zhuming Lian
Xinlei Yu
A. Kong
DiffM
313
7
0
02 Oct 2025
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
Xinlei Yu
C. Xu
Guibin Zhang
Yongbo He
Zhangquan Chen
...
Jiangning Zhang
Yue Liao
Xiaobin Hu
Yu-Gang Jiang
Shuicheng Yan
243
3
0
26 Sep 2025
Does FLUX Already Know How to Perform Physically Plausible Image Composition?
Shilin Lu
Zhuming Lian
Zihan Zhou
Shaocong Zhang
Chen Zhao
A. Kong
316
11
0
25 Sep 2025
SSCM: A Spatial-Semantic Consistent Model for Multi-Contrast MRI Super-Resolution
Xiaoman Wu
Lubin Gan
Siying Wu
Jing Zhang
Yunwei Ou
Xiaoyan Sun
220
1
0
23 Sep 2025
MVCL-DAF++: Enhancing Multimodal Intent Recognition via Prototype-Aware Contrastive Alignment and Coarse-to-Fine Dynamic Attention Fusion
Haofeng Huang
Yifei Han
Long Zhang
Bin Li
Yangfan He
183
0
0
22 Sep 2025
Robust Concept Erasure in Diffusion Models: A Theoretical Perspective on Security and Robustness
Zixuan Fu
Yan Ren
Finn Carter
Chenyue Wen
Le Ku
Daheng Yu
Emily Davis
Bo Zhang
DiffM
322
0
0
15 Sep 2025
LLM-Enhanced Multimodal Fusion for Cross-Domain Sequential Recommendation
Wangyu Wu
Zhenhong Chen
Xianglin Qiu
Siqi Song
Xiaowei Huang
Fei Ma
Jimin Xiao
AI4TS
195
12
0
22 Jun 2025
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
350
7
0
21 May 2025