Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

10 May 2021

Xiaodan Liang

Papers citing "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

50 / 238 papers shown

GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions

330

14 Apr 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

...

672

829

14 Apr 2025

VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

289

12 Apr 2025

Data Metabolism: An Efficient Data Design Schema For Vision Language Model

386

10 Apr 2025

Capybara-OMNI: An Efficient Paradigm for Building Omni-Modal Language Models

302

10 Apr 2025

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

599

10 Apr 2025

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

436

03 Apr 2025

Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

495

31 Mar 2025

Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization

Aitor Gonzalez-Agirre

Javier Hernando

Marta Villegas

VLM

473

28 Mar 2025

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models

552

26 Mar 2025

MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning

502

26 Mar 2025

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding

429

24 Mar 2025

MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection

721

23 Mar 2025

VisNumBench: Evaluating Number Sense of Multimodal Large Language Models

290

19 Mar 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Weiyun Wang

Zhangwei Gao

Lawrence Yunliang Chen

...

349

13 Mar 2025

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

504

13 Mar 2025

LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

733

187

10 Mar 2025

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

MU OffRL LRM MLLM ReLM VLM

600

361

09 Mar 2025

Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?

...

398

08 Mar 2025

Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information

221

07 Mar 2025

PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks

462

06 Mar 2025

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Abdelrahman Abouelenin

...

299

294

03 Mar 2025

Megrez-Omni Technical Report

...

235

19 Feb 2025

GeoDANO: Geometric VLM with Domain Agnostic Vision Encoder

266

17 Feb 2025

Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

354

17 Feb 2025

Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning

573

05 Feb 2025

Large Language Models are Few-shot Multivariate Time Series Classifiers

Data Mining and Knowledge Discovery (DMKD), 2025

203

30 Jan 2025

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

315

28 Jan 2025

Mathematical Language Models: A Survey

...

624

03 Jan 2025

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Computer Vision and Pattern Recognition (CVPR), 2024

...

514

20 Dec 2024

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

International Conference on Learning Representations (ICLR), 2024

...

465

16 Dec 2024

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

Computer Vision and Pattern Recognition (CVPR), 2024

240

12 Dec 2024

Chimera: Improving Generalist Model with Domain-Specific Experts

...

618

08 Dec 2024

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

...

282

03 Dec 2024

Eyes on the Road: State-of-the-Art Video Question Answering Models Assessment for Traffic Monitoring Tasks

Joseph Raj Vishal

Divesh Basina

Aarya Choudhary

Bharatesh Chakravarthi

403

02 Dec 2024

AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning

...

611

18 Nov 2024

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Computer Vision and Pattern Recognition (CVPR), 2024

...

210

16 Nov 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

...

536

186

15 Nov 2024

Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning

386

23 Oct 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Zhe Chen

...

404

21 Oct 2024

ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs

373

18 Oct 2024

GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

159

17 Oct 2024

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Computer Vision and Pattern Recognition (CVPR), 2024

Zhaokai Wang

Yu Qiao

Xizhou Zhu

VLM MLLM

383

10 Oct 2024

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

International Conference on Learning Representations (ICLR), 2024

Pan Lu

Kai-Wei Chang

Nanyun Peng

VLM

370

10 Oct 2024

Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark

Himanshu Gupta

Shreyas Verma

Ujjwala Anantheswaran

Swaroop Mishra

263

06 Oct 2024

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Haotian Zhang

Mingfei Gao

...

Zirui Wang

Yinfei Yang

307

30 Sep 2024

KALE-LM-Chem: Vision and Practice Toward an AI Brain for Chemistry

...

Xinhe Li

Yi Zhou

285

27 Sep 2024

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Xiaotian Han

Yiren Jian

Xuefeng Hu

Haogeng Liu

Yiqi Wang

...

Yuang Ai

Huaibo Huang

Ran He

Zhenheng Yang

Quanzeng You

LRM AI4CE

206

19 Sep 2024

NVLM: Open Frontier-Class Multimodal LLMs

Wenliang Dai

Zihan Liu

308

114

17 Sep 2024

MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model

Zhen Yang

Jinhao Chen

Bin Xu

Yuxiao Dong

Jie Tang

VLM LRM

193

10 Sep 2024