MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics

2 July 2025

Dmytro Kuzmenko

Nadiya Shvai

MoE


Main: 13 pages · 9 figures · Bibliography: 3 pages · 8 tables

Abstract

Mixture-of-Experts (MoE) approaches have recently gained traction in robotics applications due to their ability to dynamically allocate computational resources and specialize sub-networks for distinct tasks or environmental contexts, enabling more efficient decision-making. Such systems often comprise sparsely activated experts combined under a single monolithic architecture. They depend on a well-configured internal routing mechanism, which rules out selective low-level customization of experts and the router and requires additional training. We propose MoIRA, an architecture-agnostic modular MoE framework designed to coordinate existing experts with an external text-based router. MoIRA incorporates two zero-shot routing options: embedding-based similarity and prompt-driven language model inference. In our experiments, we choose large Vision-Language-Action models, gr00t-N1 and $\pi_0$, as the underlying experts, and train low-rank adapters for low-overhead inference. We evaluate MoIRA on various GR1 Humanoid tasks and the LIBERO Spatial and Goal benchmarks, where it consistently outperforms generalist models and competes with other MoE pipelines. Additionally, we analyse the robustness of the proposed approach to variations in the instructions. While relying solely on textual descriptions of tasks and experts, MoIRA demonstrates the practical viability of modular deployment with precise, low-effort routing and provides an alternative, scalable foundation for future multi-expert robotic systems.
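
To make the embedding-based similarity routing option concrete, here is a minimal sketch of the general idea: embed the instruction and each expert's textual description, then pick the expert with the highest cosine similarity. This is not the authors' implementation. The expert names and descriptions are hypothetical, and the bag-of-words `embed` function is a toy stand-in for whatever sentence encoder an actual system would use.

```python
import math
import re
from collections import Counter

# Hypothetical expert descriptions (illustrative only, not from the paper).
EXPERTS = {
    "pick_place": "pick up the object and place it in the container",
    "open_drawer": "open or close the drawer of the cabinet",
    "turn_stove": "turn on the stove knob",
}


def embed(text):
    """Toy bag-of-words embedding; an assumed stand-in for a real text encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def route(instruction, experts=EXPERTS):
    """Return the best-matching expert name and the per-expert scores."""
    query = embed(instruction)
    scores = {name: cosine(query, embed(desc)) for name, desc in experts.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = route("place the bowl in the container")
    print(best, scores)
```

Because the router only reads text, adding an expert in this sketch amounts to adding one entry to `EXPERTS`; no retraining of the router is involved, which mirrors the zero-shot, modular property the abstract describes.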


@article{kuzmenko2025_2507.01843,
  title={MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics},
  author={Dmytro Kuzmenko and Nadiya Shvai},
  journal={arXiv preprint arXiv:2507.01843},
  year={2025}
}
