Olympus: A Universal Task Router for Computer Vision Tasks

12 December 2024
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip H. S. Torr
    VLM
    ObjD
Abstract

We introduce Olympus, a new approach that transforms Multimodal Large Language Models (MLLMs) into a unified framework capable of handling a wide array of computer vision tasks. Utilizing a controller MLLM, Olympus delegates over 20 specialized tasks across images, videos, and 3D objects to dedicated modules. This instruction-based routing enables complex workflows through chained actions without the need for training heavy generative models. Olympus easily integrates with existing MLLMs, expanding their capabilities with comparable performance. Experimental results demonstrate that Olympus achieves an average routing accuracy of 94.75% across 20 tasks and precision of 91.82% in chained action scenarios, showcasing its effectiveness as a universal task router that can solve a diverse range of computer vision tasks. Project page: this http URL
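
The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword-based `controller` is a hypothetical stand-in for the controller MLLM, and the task names (`image_gen`, `image_edit`, `pose_estimation`) and stub modules are assumptions made for illustration. It only shows how an instruction might be split into an ordered plan and dispatched to dedicated modules, with each step receiving the previous step's output.

```python
from typing import Callable, Dict, List, Tuple

# A module takes the step's prompt and the previous step's output.
Module = Callable[[str, str], str]


def make_stub(name: str) -> Module:
    """Build a placeholder module that only records what it was asked to do."""
    def run(prompt: str, previous: str) -> str:
        return f"{name}({prompt!r}, input={previous!r})"
    return run


# Hypothetical registry; real entries would wrap external specialist models.
MODULES: Dict[str, Module] = {
    "image_gen": make_stub("image_gen"),
    "image_edit": make_stub("image_edit"),
    "pose_estimation": make_stub("pose_estimation"),
}

# Assumed keyword rules standing in for a learned controller.
KEYWORDS: Dict[str, str] = {
    "generate": "image_gen",
    "edit": "image_edit",
    "pose": "pose_estimation",
}


def controller(instruction: str) -> List[Tuple[str, str]]:
    """Return an ordered plan of (task, prompt) pairs for the instruction."""
    plan: List[Tuple[str, str]] = []
    for step in instruction.split(" then "):
        for keyword, task in KEYWORDS.items():
            if keyword in step.lower():
                plan.append((task, step.strip()))
                break  # first matching rule wins for this step
    return plan


def route(instruction: str) -> List[str]:
    """Run each planned step, feeding every output into the next step."""
    outputs: List[str] = []
    previous = ""
    for task, prompt in controller(instruction):
        previous = MODULES[task](prompt, previous)
        outputs.append(previous)
    return outputs


if __name__ == "__main__":
    for line in route("Generate a photo of a dancer then estimate the pose"):
        print(line)
```

In this sketch the controller never runs a heavy model itself; it only chooses which module handles each step, which is the division of labour the abstract describes.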

@article{lin2025_2412.09612,
  title={Olympus: A Universal Task Router for Computer Vision Tasks},
  author={Yuanze Lin and Yunsheng Li and Dongdong Chen and Weijian Xu and Ronald Clark and Philip H. S. Torr},
  journal={arXiv preprint arXiv:2412.09612},
  year={2025}
}