arXiv: 2508.03404
Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling

5 August 2025
Xinlei Yu
Z. Chen
Yudong Zhang
Shilin Lu
Ruolin Shen
J. Zhang
Xiaobin Hu
Yanwei Fu
Shuicheng Yan

Papers citing "Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling"

13 / 13 papers shown
Diffusion-Based Image Editing: An Unforeseen Adversary to Robust Invisible Watermarks
Wenkai Fu
Finn Carter
Y. Wang
Emily Davis
Bo Zhang
WIGM
433
0
0
05 Nov 2025
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
Yizhang Zhu
Liangwei Wang
Chenyu Yang
Xiaotian Lin
Boyan Li
...
Shaolei Zhang
Y. Zhang
Xuanhe Zhou
Guoliang Li
Yuyu Luo
AI4TS
189
2
0
27 Oct 2025
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
Sensen Gao
Shanshan Zhao
Xu Jiang
Lunhao Duan
Yong Xien Chng
Qing-Guo Chen
Weihua Luo
Kaifu Zhang
Jia-Wang Bian
Mingming Gong
272
1
0
17 Oct 2025
DeRainMamba: A Frequency-Aware State Space Model with Detail Enhancement for Image Deraining
IEEE Signal Processing Letters (IEEE SPL), 2025
Zhiliang Zhu
Tao Zeng
Tao Yang
Guoliang Luo
Jiyong Zeng
Mamba
228
0
0
08 Oct 2025
Diffusion-Based Image Editing for Breaking Robust Watermarks
Yunyi Ni
Finn Carter
Ze Niu
Emily Davis
Bo Zhang
WIGM
462
1
0
07 Oct 2025
DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing
Zihan Zhou
Shilin Lu
Shuli Leng
Shaocong Zhang
Zhuming Lian
Xinlei Yu
A. Kong
DiffM
313
7
0
02 Oct 2025
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
Xinlei Yu
C. Xu
Guibin Zhang
Yongbo He
Zhangquan Chen
...
Jiangning Zhang
Yue Liao
Xiaobin Hu
Yu-Gang Jiang
Shuicheng Yan
243
3
0
26 Sep 2025
Does FLUX Already Know How to Perform Physically Plausible Image Composition?
Shilin Lu
Zhuming Lian
Zihan Zhou
Shaocong Zhang
Chen Zhao
A. Kong
316
11
0
25 Sep 2025
SSCM: A Spatial-Semantic Consistent Model for Multi-Contrast MRI Super-Resolution
Xiaoman Wu
Lubin Gan
Siying Wu
Jing Zhang
Yunwei Ou
Xiaoyan Sun
220
1
0
23 Sep 2025
MVCL-DAF++: Enhancing Multimodal Intent Recognition via Prototype-Aware Contrastive Alignment and Coarse-to-Fine Dynamic Attention Fusion
Haofeng Huang
Yifei Han
Long Zhang
Bin Li
Yangfan He
183
0
0
22 Sep 2025
Robust Concept Erasure in Diffusion Models: A Theoretical Perspective on Security and Robustness
Zixuan Fu
Yan Ren
Finn Carter
Chenyue Wen
Le Ku
Daheng Yu
Emily Davis
Bo Zhang
DiffM
322
0
0
15 Sep 2025
LLM-Enhanced Multimodal Fusion for Cross-Domain Sequential Recommendation
Wangyu Wu
Zhenhong Chen
Xianglin Qiu
Siqi Song
Xiaowei Huang
Fei Ma
Jimin Xiao
AI4TS
195
12
0
22 Jun 2025
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
350
7
0
21 May 2025