
Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction

Abstract

Service mobile robots often need to avoid dynamic objects while performing their tasks, yet they typically have limited computational resources. We therefore present a lightweight multi-modal framework for 3D object detection and trajectory prediction. The system fuses LiDAR and camera inputs to perceive pedestrians, vehicles, and riders in 3D space in real time. The framework introduces two novel modules: 1) a Cross-Modal Deformable Transformer (CMDT) for accurate object detection at an acceptable computational cost, and 2) a Reference Trajectory-based Multi-Class Transformer (RTMCT) for efficient and diverse trajectory prediction of multi-class objects with flexible trajectory lengths. Evaluations on the CODa benchmark show improvements over existing methods in both detection (+2.03% in mAP) and trajectory prediction (-0.408 m in minADE5 for pedestrians). The system is also readily deployable: on a wheelchair robot with an entry-level NVIDIA 3060 GPU, it achieves real-time inference at 13.2 fps. To facilitate reproducibility and practical deployment, we release the code of the method at this https URL and its ROS inference version at this https URL.
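The abstract reports trajectory prediction quality as minADE5. As a minimal sketch, assuming the commonly used definition (the smallest average displacement error among K candidate trajectories, here K = 5), the metric can be computed as below. The function name `min_ade`, the 2D point format, and the sample data are illustrative assumptions, not taken from the released code; the paper may define the prediction horizon and per-class averaging differently.

```python
import math


def min_ade(predictions, ground_truth):
    """Smallest average displacement error over K candidate trajectories.

    predictions:  list of K trajectories, each a list of (x, y) points in metres.
    ground_truth: list of (x, y) points with the same length as each candidate.
    Hypothetical helper for illustration only.
    """
    if not predictions or not ground_truth:
        raise ValueError("need at least one candidate and one ground-truth point")
    best = math.inf
    for trajectory in predictions:
        if len(trajectory) != len(ground_truth):
            raise ValueError("candidate and ground truth must have equal length")
        # Average Euclidean distance between matching time steps.
        ade = sum(math.dist(p, g) for p, g in zip(trajectory, ground_truth))
        ade /= len(ground_truth)
        best = min(best, ade)
    return best


# Made-up example data: two candidates against a straight-line ground truth.
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1 m lateral offset
    [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)],  # drifts away gradually
]
result = min_ade(candidates, ground_truth)
```

Under this definition a lower value is better, so the reported -0.408 m corresponds to a reduction in error relative to the compared methods.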

@article{he2025_2504.13647,
  title={Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction},
  author={Yushen He and Lei Zhao and Tianchen Deng and Zipeng Fang and Weidong Chen},
  journal={arXiv preprint arXiv:2504.13647},
  year={2025}
}