Guiding Data Collection via Factored Scaling Curves

12 May 2025
Lihan Zha
Apurva Badithela
Michael Zhang
Justin Lidard
Jeremy Bao
Emily Zhou
David Snyder
Allen Z. Ren
Dhruv Shah
Anirudha Majumdar
    OffRL
Abstract

Generalist imitation learning policies trained on large datasets show great promise for solving diverse manipulation tasks. However, to ensure generalization to different conditions, policies need to be trained with data collected across a large set of environmental factor variations (e.g., camera pose, table height, distractors), a prohibitively expensive undertaking if done exhaustively. We introduce a principled method for deciding what data to collect and how much to collect for each factor by constructing factored scaling curves (FSC), which quantify how policy performance varies as data scales along individual or paired factors. These curves enable targeted data acquisition for the most influential factor combinations within a given budget. We evaluate the proposed method through extensive simulated and real-world experiments, across both training-from-scratch and fine-tuning settings, and show that it boosts success rates in real-world tasks in new environments by up to 26% over existing data-collection strategies. We further demonstrate how factored scaling curves can effectively guide data collection using an offline metric, without requiring real-world evaluation at scale.
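
To make the idea of a per-factor scaling curve concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the functional form of the curves or the allocation rule, so the sketch assumes a power law (failure rate ≈ a · n^(-b)) fitted in log-log space and ranks factors by the predicted reduction in failure rate from a fixed number of extra demonstrations. The function names and all measurements are hypothetical and made up for illustration.

```python
import numpy as np

def fit_power_law(n, err):
    """Fit err ~ a * n**(-b) by least squares in log-log space (assumed form)."""
    slope, intercept = np.polyfit(np.log(n), np.log(err), 1)
    return np.exp(intercept), -slope

def predicted_gain(a, b, n_now, extra):
    """Predicted drop in failure rate from adding `extra` demos along one factor."""
    return a * n_now ** (-b) - a * (n_now + extra) ** (-b)

def rank_factors(curves, extra):
    """Rank factors by predicted gain; `curves` maps name -> (demo counts, failure rates)."""
    gains = {}
    for name, (n, err) in curves.items():
        n = np.asarray(n, dtype=float)
        err = np.asarray(err, dtype=float)
        a, b = fit_power_law(n, err)
        gains[name] = predicted_gain(a, b, n[-1], extra)
    ranking = sorted(gains, key=gains.get, reverse=True)
    return ranking, gains

# Hypothetical, made-up measurements: failure rate of the policy after training
# with 10, 20, 40, and 80 demonstrations that vary a single factor.
curves = {
    "camera_pose":  ([10, 20, 40, 80], [0.60, 0.45, 0.34, 0.25]),
    "table_height": ([10, 20, 40, 80], [0.30, 0.28, 0.27, 0.26]),
    "distractors":  ([10, 20, 40, 80], [0.50, 0.42, 0.36, 0.30]),
}

ranking, gains = rank_factors(curves, extra=40)
for name in ranking:
    print(f"{name}: predicted failure-rate reduction {gains[name]:.3f}")
```

Under these made-up numbers, the steepest curve (camera pose) is ranked first, so a fixed collection budget would go there before the nearly flat table-height curve. The paper also considers curves over paired factors, which this single-factor sketch does not cover.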

@article{zha2025_2505.07728,
  title={Guiding Data Collection via Factored Scaling Curves},
  author={Lihan Zha and Apurva Badithela and Michael Zhang and Justin Lidard and Jeremy Bao and Emily Zhou and David Snyder and Allen Z. Ren and Dhruv Shah and Anirudha Majumdar},
  journal={arXiv preprint arXiv:2505.07728},
  year={2025}
}