arXiv: 2508.11737

Ovis2.5 Technical Report

15 August 2025

ArXiv (abs) | PDF | HTML | HuggingFace (97 upvotes) | GitHub (1330★)

Papers citing "Ovis2.5 Technical Report"

13 / 13 papers shown

Jina-VLM: Small Multilingual Vision Language Model

Andreas Koukounas

Georgios Mastrapas

Florian Hönicke

Sedigheh Eslami

Guillaume Roncari

356

0

0

03 Dec 2025

Ovis-Image Technical Report

...

533

0

0

28 Nov 2025

You Only Forward Once: An Efficient Compositional Judging Paradigm

134

0

0

20 Nov 2025

PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies

Muhammad Jehanzeb Mirza

227

0

0

18 Oct 2025

Scope: Selective Cross-modal Orchestration of Visual Perception Experts

Juan A. Rodriguez

Perouz Taslakian

277

0

0

14 Oct 2025

FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model

247

3

0

13 Oct 2025

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

94

3

0

10 Oct 2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

...

MLLM OffRL VLM LRM

745

8

0

06 Oct 2025

Efficient Test-Time Scaling for Small Vision-Language Models

Mehmet Onurcan Kaya

Desmond Elliott

Dim P. Papadopoulos

188

2

0

03 Oct 2025

From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models

...

453

9

0

29 Sep 2025

Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow

...

Jiangning Zhang

243

3

0

26 Sep 2025

SAIL-VL2 Technical Report

...

296

4

0

17 Sep 2025

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

...

197

24

0

16 Sep 2025