8-Calves Image dataset

Abstract

We introduce the 8-Calves dataset, a benchmark for evaluating object detection and identity preservation in occlusion-rich, temporally consistent environments. Comprising 900 static frames and a 1-hour video (67,760 frames) of eight Holstein Friesian calves with unique coat patterns, the dataset emphasizes real-world challenges such as prolonged occlusions, motion blur, and pose variation. By fine-tuning 28 object detectors (YOLO variants, transformers) and evaluating 23 pretrained backbones (ResNet, ConvNextV2, ViTs), we expose critical architectural trade-offs: smaller models (e.g., ConvNextV2 Nano, 15.6M parameters) excel in efficiency and retrieval accuracy, while pure vision transformers lag in occlusion-heavy settings. The dataset's structured design (fixed camera views, natural motion, and verified identities) provides a reproducible testbed for object detection challenges (mAP50:95 of 56.5-66.4%), bridging synthetic simplicity and domain-specific complexity. The dataset and benchmark code are publicly available at this https URL. Limitations include partial labeling and detector bias, which are addressed in later sections.
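
For readers unfamiliar with the mAP50:95 notation, the following is a minimal sketch of the idea behind it, assuming the common COCO-style convention in which average precision is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The abstract does not spell out the evaluation protocol, so this convention is an assumption here, and the helper names `iou` and `thresholds_passed` are hypothetical, not taken from the benchmark code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """List the IoU thresholds at which a prediction would count as a match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]
```

Under this convention, a detection that overlaps its ground-truth box only loosely counts as correct at the lower thresholds but not the higher ones, which is why mAP50:95 rewards tight localization more than mAP50 does.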

@article{fang2025_2503.13777,
  title={8-Calves Image dataset},
  author={Xuyang Fang and Sion Hannuna and Neill Campbell},
  journal={arXiv preprint arXiv:2503.13777},
  year={2025}
}