Making Progress Based on False Discoveries

19 April 2022
Roi Livni
arXiv:2204.08809
Abstract

The study of adaptive data analysis examines how many statistical queries can be answered accurately from a fixed dataset while avoiding false discoveries (statistically inaccurate answers). In this paper, we tackle a question that precedes the field of study: Is data only valuable when it provides accurate answers to statistical queries? To answer this question, we use Stochastic Convex Optimization as a case study. In this model, an algorithm is viewed as an analyst that, at each iteration, queries an estimate of the gradient of a noisy function and moves toward its minimizer. It is known that $O(1/\epsilon^2)$ examples suffice to minimize the objective function, but none of the existing methods depend on the estimated gradients being accurate along the trajectory. We therefore ask: How many samples are needed to minimize a noisy convex function if we require $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients? Or, might it be that inaccurate gradient estimates are \emph{necessary} for finding the minimum of a stochastic convex function at an optimal statistical rate? We provide two partial answers. First, we show that a general analyst (one whose queries may be chosen maliciously) requires $\Omega(1/\epsilon^3)$ samples, ruling out the possibility of a foolproof mechanism. Second, we show that, under certain assumptions on the oracle, $\tilde{\Omega}(1/\epsilon^{2.5})$ samples are necessary when gradient descent interacts with the oracle. Our results stand in contrast to classical bounds showing that $O(1/\epsilon^2)$ samples suffice to optimize the population risk to accuracy $O(\epsilon)$, but with spurious gradients.
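The analyst-oracle interaction the abstract describes can be made concrete with a short sketch. Below is a minimal, hypothetical illustration (not code from the paper): gradient descent plays the analyst, adaptively querying an empirical gradient oracle backed by one fixed sample of size roughly $1/\epsilon^2$. The toy objective, the oracle construction, and all names (`make_oracle`, `gradient_descent`) are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Toy population risk f(w) = E_z[(w - z)^2 / 2], whose minimizer is E[z].
# The oracle answers gradient queries using one fixed dataset; queries are
# adaptive because each query point depends on all previous answers.

rng = np.random.default_rng(0)

def make_oracle(sample):
    """Empirical-gradient oracle backed by a fixed dataset (illustrative)."""
    def gradient_estimate(w):
        # Empirical gradient of E_z[(w - z)^2 / 2] at the query point w.
        return np.mean(w - sample)
    return gradient_estimate

def gradient_descent(oracle, w0=0.0, lr=0.1, steps=200):
    w = w0
    for _ in range(steps):
        w -= lr * oracle(w)  # adaptive query: w depends on past answers
    return w

n = 10_000  # heuristically O(1/eps^2) samples for target accuracy eps = 0.01
sample = rng.normal(loc=1.0, scale=1.0, size=n)  # population optimum is 1.0
w_hat = gradient_descent(make_oracle(sample))
print(f"estimated minimizer: {w_hat:.3f} (population optimum: 1.000)")
```

The paper's question, in these terms, is whether the individual answers returned by `gradient_estimate` along the trajectory must themselves be $\epsilon$-accurate for the population risk, or whether optimization can succeed at the optimal statistical rate even when they are not.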
