This paper presents a sim-to-real pipeline for autonomous strawberry picking from dense clusters using a Franka Panda robot. Our approach uses a custom MuJoCo simulation environment with domain randomization, in which a deep reinforcement learning agent is trained with the dormant ratio minimization algorithm. The proposed pipeline bridges low-level control with high-level perception and decision making, and it shows promising performance both in simulation and in a real laboratory environment, laying the groundwork for transfer to real-world autonomous fruit harvesting.
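The abstract names dormant ratio minimization without defining the dormant ratio. As a minimal sketch, assuming the definition commonly used in the dormant-neuron literature (a neuron is dormant when its mean absolute activation, normalized by the layer average, is at or below a threshold), the ratio for one layer could be computed as below. The function name `dormant_ratio`, the threshold value `tau=0.025`, and the list-of-lists input format are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def dormant_ratio(activations: Sequence[Sequence[float]], tau: float = 0.025) -> float:
    """Return the fraction of dormant neurons in a single layer.

    activations[b][i] is the post-activation output of neuron i for batch
    sample b. The threshold tau is an assumed, illustrative value.
    """
    if not activations or not activations[0]:
        raise ValueError("activations must be a non-empty batch of non-empty rows")

    n_samples = len(activations)
    n_neurons = len(activations[0])

    # Mean absolute activation of each neuron over the batch.
    mean_abs = [
        sum(abs(row[i]) for row in activations) / n_samples
        for i in range(n_neurons)
    ]

    # Average of those means across the layer, used for normalization.
    layer_mean = sum(mean_abs) / n_neurons
    if layer_mean == 0.0:
        # Every neuron is silent, so all of them count as dormant.
        return 1.0

    # A neuron is tau-dormant if its normalized score is at or below tau.
    dormant = sum(1 for m in mean_abs if m / layer_mean <= tau)
    return dormant / n_neurons


if __name__ == "__main__":
    batch = [
        [1.0, 0.0, 2.0, 0.00],
        [3.0, 0.0, 2.0, 0.01],
    ]
    print(dormant_ratio(batch))
```

In a training loop, a value like this would typically be averaged over layers and monitored over time; how the paper's agent uses it to drive perturbation or exploration is not stated in the abstract.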
@article{williams2025_2505.08458,
  title={Zero-Shot Sim-to-Real Reinforcement Learning for Fruit Harvesting},
  author={Emlyn Williams and Athanasios Polydoros},
  journal={arXiv preprint arXiv:2505.08458},
  year={2025}
}