OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction

5 March 2025

Papers citing "OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction"


Title
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks V. Bhat Yu-Hsiang Lan P. Krishnamurthy Ramesh Karri Farshad Khorrami 9 0 0 09 May 2025
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges Ranjan Sapkota Yang Cao Konstantinos I Roumeliotis Manoj Karkee LM&Ro 60 0 0 07 May 2025
Generalization Capability for Imitation Learning Yixiao Wang 20 0 0 25 Apr 2025
$π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization Physical Intelligence Kevin Black Noah Brown James Darpinian Karan Dhabalia ... Homer Walke Anna Walling Haohuan Wang Lili Yu Ury Zhilinsky LM&Ro VLM 18 5 0 22 Apr 2025