OPAL: Encoding Causal Understanding of Physical Systems for Robot Learning
- AI4CE
Main: 12 pages, 2 figures, 3 tables
Bibliography: 2 pages
Appendix: 5 pages
Abstract
We present OPAL (Operant Physical Agent with Language), a novel vision-language-action architecture that introduces topological constraints to flow matching for robotic control. To do so, we further introduce topological attention. Our approach models action sequences as topologically-structured representations with non-trivial constraints. Experimental results across 10 complex manipulation tasks demonstrate OPAL's superior performance compared to previous approaches, including Octo, OpenVLA, and π0.
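For readers unfamiliar with flow matching, the following is a minimal sketch of the generic technique applied to an action sequence, using a straight-line path between a noise sample and a target action chunk. It is not OPAL's method: the topological constraints and topological attention are not specified in the abstract and are omitted here. The function names, the horizon of 8 steps, and the 7-dimensional action are hypothetical choices made only for illustration, and an oracle velocity field stands in for a learned network.

```python
import numpy as np


def flow_matching_pair(x0, x1, t):
    """Straight-line flow matching: return the point x_t on the path from
    x0 (noise) to x1 (data) and the regression target for the velocity."""
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0  # velocity is constant along a straight-line path
    return x_t, v_target


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x


rng = np.random.default_rng(0)
horizon, action_dim = 8, 7  # hypothetical action-chunk shape
noise = rng.standard_normal((horizon, action_dim))
actions = rng.uniform(-1.0, 1.0, (horizon, action_dim))  # stand-in for demonstrated actions

# Training side: a model v_theta(x_t, t, observation) would be regressed onto v_target.
x_t, v_target = flow_matching_pair(noise, actions, t=0.3)

# Sampling side: with the oracle velocity in place of a learned model,
# Euler integration carries the noise sample onto the action sequence.
def oracle_velocity(x, t):
    return actions - noise

recovered = euler_sample(oracle_velocity, noise)
```

In a trained policy the oracle would be replaced by a network conditioned on images and language, and any structural constraint on the action sequence would have to be imposed on that network or on the path itself.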
