
Near-Optimal Model Discrimination with Non-Disclosure

4 December 2020
Dmitrii Ostrovskii
M. Ndaoud
Adel Javanmard
Meisam Razaviyayn
    FedML
Abstract

Let $\theta_0,\theta_1 \in \mathbb{R}^d$ be the population risk minimizers associated to some loss $\ell: \mathbb{R}^d \times \mathcal{Z} \to \mathbb{R}$ and two distributions $\mathbb{P}_0,\mathbb{P}_1$ on $\mathcal{Z}$. We pose the following question: given i.i.d. samples from $\mathbb{P}_0$ and $\mathbb{P}_1$, what sample sizes are sufficient and necessary to distinguish between the two hypotheses $\theta^* = \theta_0$ and $\theta^* = \theta_1$ for a given $\theta^* \in \{\theta_0, \theta_1\}$? Making the first steps towards answering this question in full generality, we first consider the case of a well-specified linear model with squared loss. Here we provide matching upper and lower bounds on the sample complexity, showing it to be $\min\{1/\Delta^2, \sqrt{r}/\Delta\}$ up to a constant factor, where $\Delta$ is a measure of separation between $\mathbb{P}_0$ and $\mathbb{P}_1$, and $r$ is the rank of the design covariance matrix. This bound is dimension-independent, and rank-independent for large enough separation. We then extend this result in two directions: (i) to the general parametric setup in the asymptotic regime; (ii) to generalized linear models in the small-sample regime $n \le r$ and under weak moment assumptions. In both cases we derive sample complexity bounds of a similar form, even under misspecification. Our testing procedures access $\theta^*$ only through a certain functional of the empirical risk. In addition, the number of observations that allows us to reach statistical confidence in our tests does not allow us to "resolve" the two models -- that is, to recover $\theta_0,\theta_1$ up to $O(\Delta)$ prediction accuracy.
These two properties make it possible to apply our framework in applied tasks where one would like to \textit{identify} a prediction model, which may be proprietary, while guaranteeing that the model cannot actually be \textit{inferred} by the identifying agent.
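To make the discrimination problem concrete, the sketch below simulates a well-specified linear model with squared loss and decides between the two hypotheses with a naive plug-in rule: declare whichever of $\theta_0,\theta_1$ has the smaller empirical risk on the observed sample. This is an illustrative assumption, not the paper's actual testing procedure (which accesses $\theta^*$ only through a certain functional of the empirical risk); all function names and the simulation parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n, noise=1.0):
    """Draw n samples (x, y) from a well-specified linear model y = <x, theta> + eps."""
    d = theta.shape[0]
    X = rng.standard_normal((n, d))
    y = X @ theta + noise * rng.standard_normal(n)
    return X, y

def empirical_risk(theta, X, y):
    """Empirical squared-loss risk of a candidate parameter theta."""
    return np.mean((y - X @ theta) ** 2)

def discriminate(X, y, theta0, theta1):
    """Naive plug-in test (for illustration only): pick the hypothesis
    whose candidate parameter attains the smaller empirical risk."""
    r0 = empirical_risk(theta0, X, y)
    r1 = empirical_risk(theta1, X, y)
    return 0 if r0 <= r1 else 1

# Two well-separated candidate models in dimension d = 50 (hypothetical setup).
d = 50
theta0 = np.zeros(d)
theta1 = np.zeros(d)
theta1[0] = 1.0  # unit separation along one coordinate

# Data actually generated under the hypothesis theta* = theta1.
X, y = simulate(theta1, n=200)
print(discriminate(X, y, theta0, theta1))  # typically outputs 1 at this sample size
```

Note that the decision rule above only uses the two empirical risk values, never an estimate of $\theta^*$ itself, which loosely mirrors the non-disclosure property of the abstract: at sample sizes sufficient for the test, the data need not suffice to recover the underlying parameter to $O(\Delta)$ accuracy.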
