MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation

Scene flow estimation aims to predict 3D motion from consecutive point cloud frames and is of great interest in the autonomous driving field. Existing methods face challenges such as insufficient spatio-temporal modeling and the inherent loss of fine-grained features during voxelization. The success of Mamba, a representative state space model (SSM) that enables global modeling with linear complexity, offers a promising solution. In this paper, we propose MambaFlow, a novel scene flow estimation network with a Mamba-based decoder. A well-designed backbone enables deep interaction and coupling of spatio-temporal features. Innovatively, the efficient Mamba-based decoder steers the global attention modeling of voxel-based features with point offset information, learning voxel-to-point patterns that devoxelize shared voxel representations into point-wise features. To further enhance the model's generalization across diverse scenarios, we propose a novel scene-adaptive loss function that automatically adapts to different motion patterns. Experiments on the Argoverse 2 benchmark demonstrate that MambaFlow achieves state-of-the-art performance with real-time inference speed among existing works, enabling accurate flow estimation in real-world urban scenarios. The code is available at this https URL.
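As a rough illustration of the offset-guided devoxelization idea described in the abstract (not the authors' implementation; the module name, tensor shapes, and fusion scheme below are assumptions), the following PyTorch sketch gathers each point's shared voxel feature and modulates it with an embedding of the point's offset from its voxel center to recover point-wise features:

```python
# Minimal PyTorch sketch of offset-guided devoxelization (illustrative only;
# module names, shapes, and the fusion scheme are assumptions, not MambaFlow's code).
import torch
import torch.nn as nn

class OffsetGuidedDevoxelize(nn.Module):
    """Map shared per-voxel features back to per-point features,
    steered by each point's offset from its voxel center."""
    def __init__(self, voxel_dim: int, offset_dim: int = 3):
        super().__init__()
        # Embed the 3D offset of a point relative to its voxel center.
        self.offset_mlp = nn.Sequential(
            nn.Linear(offset_dim, voxel_dim), nn.ReLU(),
            nn.Linear(voxel_dim, voxel_dim),
        )
        # Fuse the gathered voxel feature with the offset embedding.
        self.fuse = nn.Linear(2 * voxel_dim, voxel_dim)

    def forward(self, voxel_feats, point_voxel_idx, point_offsets):
        # voxel_feats:     (V, C)  shared features of V occupied voxels
        # point_voxel_idx: (N,)    index of the voxel each point falls into
        # point_offsets:   (N, 3)  point position minus its voxel center
        gathered = voxel_feats[point_voxel_idx]       # (N, C)
        guide = self.offset_mlp(point_offsets)        # (N, C)
        fused = torch.cat([gathered, guide], dim=-1)  # (N, 2C)
        return self.fuse(fused)                       # (N, C) point-wise features

# Tiny usage example with random data.
if __name__ == "__main__":
    dev = OffsetGuidedDevoxelize(voxel_dim=64)
    voxel_feats = torch.randn(128, 64)                 # 128 occupied voxels
    point_voxel_idx = torch.randint(0, 128, (1000,))   # voxel index per point
    point_offsets = torch.randn(1000, 3) * 0.1         # offsets from voxel centers
    point_feats = dev(voxel_feats, point_voxel_idx, point_offsets)
    print(point_feats.shape)                           # torch.Size([1000, 64])
```

In the paper's pipeline this step would follow the Mamba-based decoder, whose globally modeled voxel features take the place of the randomly initialized `voxel_feats` used here.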
@article{luo2025_2502.16907,
  title   = {MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation},
  author  = {Jiehao Luo and Jintao Cheng and Xiaoyu Tang and Qingwen Zhang and Bohuan Xue and Rui Fan},
  journal = {arXiv preprint arXiv:2502.16907},
  year    = {2025}
}