PooDLe: Pooled and dense self-supervised learning from naturalistic videos

20 August 2024
Alex N. Wang
Christopher Hoang
Yuwen Xiong
Yann LeCun
Mengye Ren
Abstract

Self-supervised learning has driven significant progress in learning from single-subject, iconic images. However, there are still unanswered questions about the use of minimally-curated, naturalistic video data, which contain dense scenes with many independent objects, imbalanced class distributions, and varying object sizes. In this paper, we propose PooDLe, a self-supervised learning method that combines an invariance-based objective on pooled representations with a dense SSL objective that enforces equivariance to optical flow warping. Our results show that a unified objective applied at multiple feature scales is essential for learning effective image representations from naturalistic videos. We validate our method with experiments on the BDD100K driving video dataset and the Walking Tours first-person video dataset, demonstrating its ability to capture spatial understanding from a dense objective and semantic understanding via a pooled representation objective.
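The two objectives named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-distance loss, the nearest-neighbor flow lookup, the single feature scale, and every name below (`dense_equivariance_loss`, `pooled_invariance_loss`, `combined_loss`, `pooled_weight`) are assumptions chosen for illustration. Feature maps are plain nested lists of shape height x width x channels, and `flow[y][x] = (dy, dx)` is assumed to mean that the content at `(y, x)` in frame t appears at `(y + dy, x + dx)` in frame t+1.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def global_avg_pool(feat):
    """Average an H x W x C feature map over all spatial positions."""
    cells = [vec for row in feat for vec in row]
    channels = len(cells[0])
    return [sum(vec[c] for vec in cells) / len(cells) for c in range(channels)]


def pooled_invariance_loss(feat_a, feat_b):
    """Assumed invariance term: pooled features of two frames should agree."""
    return 1.0 - cosine(global_avg_pool(feat_a), global_avg_pool(feat_b))


def dense_equivariance_loss(feat_t, feat_t1, flow):
    """Assumed dense term: each feature in frame t should match the feature
    at its flow-displaced location in frame t+1 (nearest-neighbor lookup).
    Positions whose target falls outside the map are skipped."""
    height, width = len(feat_t), len(feat_t[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            dy, dx = flow[y][x]
            ty, tx = y + round(dy), x + round(dx)
            if 0 <= ty < height and 0 <= tx < width:
                total += 1.0 - cosine(feat_t[y][x], feat_t1[ty][tx])
                count += 1
    return total / count if count else 0.0


def combined_loss(feat_t, feat_t1, flow, pooled_weight=1.0):
    """Sum of both terms; the weighting scheme is a hypothetical choice."""
    dense = dense_equivariance_loss(feat_t, feat_t1, flow)
    pooled = pooled_invariance_loss(feat_t, feat_t1)
    return dense + pooled_weight * pooled
```

In this toy setup, the dense term is close to zero when the second feature map is the first one shifted by exactly the flow, and it grows when the flow is wrong. The abstract describes applying the objective at multiple feature scales, which this single-scale sketch does not attempt.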

@article{wang2025_2408.11208,
  title={PooDLe: Pooled and dense self-supervised learning from naturalistic videos},
  author={Alex N. Wang and Christopher Hoang and Yuwen Xiong and Yann LeCun and Mengye Ren},
  journal={arXiv preprint arXiv:2408.11208},
  year={2025}
}