
Autonomous AI Surveillance: Multimodal Deep Learning for Cognitive and Behavioral Monitoring

Ameer Hamza
Zuhaib Hussain But
Umar Arif
Samiya
M. Abdullah Asad
Muhammad Naeem
Main: 5 Pages
5 Figures
Bibliography: 1 Page
Abstract

This study presents a novel classroom surveillance system that integrates multiple modalities, including drowsiness detection, mobile phone usage tracking, and face recognition, to assess student attentiveness with enhanced [...]. The system leverages the YOLOv8 model to detect both mobile phone usage and sleep (Ghatge et al., 2024), while facial recognition is achieved through LResNet Occ FC together with body tracking using YOLO and MTCNN (Durai et al., 2024). These models work together to provide comprehensive, real-time monitoring, offering insights into student engagement and behavior (S et al., 2023). The framework is trained on specialized datasets, such as the RMFD dataset for face recognition and a Roboflow dataset for mobile phone detection. Extensive evaluation of the system shows promising results: sleep detection achieves 97.42% mAP@50, face recognition achieves 86.45% validation accuracy, and mobile phone detection reaches 85.89% mAP@50. The system is implemented within a core PHP web application and uses ESP32-CAM hardware for data capture (Neto et al., 2024). This integrated approach not only enhances classroom monitoring but also records attendance automatically via face recognition while students remain seated in the classroom, offering scalability for diverse educational environments (Banada, 2025).
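
The detection results above are reported as mAP@50, which means a predicted bounding box is counted as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion, not the authors' evaluation code; the names `Box`, `iou`, and `is_match_at_50` are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough for mAP@50."""
    return iou(pred, truth) >= threshold
```

For example, a predicted phone box shifted by 2 pixels against a 10-pixel-wide ground-truth box still matches (IoU of 80/120), while one shifted by 5 pixels does not (IoU of 50/150). The full mAP@50 figure additionally averages precision over recall levels and over classes, which this sketch does not cover.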

@article{hamza2025_2507.01590,
  title={Autonomous AI Surveillance: Multimodal Deep Learning for Cognitive and Behavioral Monitoring},
  author={Ameer Hamza and Zuhaib Hussain But and Umar Arif and Samiya and M. Abdullah Asad and Muhammad Naeem},
  journal={arXiv preprint arXiv:2507.01590},
  year={2025}
}