All Papers
Title |
|---|
Title |
|---|

Multi-animal tracking is crucial for understanding animal ecology and behavior. However, it remains a challenging task due to variations in habitat, motion patterns, and species appearance. Traditional approaches typically require extensive model fine-tuning and heuristic design for each application scenario. In this work, we explore the potential of recent vision foundation models for zero-shot multi-animal tracking. By combining a Grounding Dino object detector with the Segment Anything Model 2 (SAM 2) tracker and carefully designed heuristics, we develop a tracking framework that can be applied to new datasets without any retraining or hyperparameter adaptation. Evaluations on ChimpAct, Bird Flock Tracking, AnimalTrack, and a subset of GMOT-40 demonstrate strong and consistent performance across diverse species and environments. The code is available atthis https URL.
View on arXiv