
arXiv:1810.06838

Finite-sample analysis of M-estimators using self-concordance

16 October 2018
Dmitrii Ostrovskii
Francis R. Bach
Abstract

The classical asymptotic theory for parametric $M$-estimators guarantees that, in the limit of infinite sample size, the excess risk has a chi-square type distribution, even in the misspecified case. We demonstrate how self-concordance of the loss allows us to characterize the critical sample size sufficient to guarantee a chi-square type in-probability bound for the excess risk. Specifically, we consider two classes of losses: (i) self-concordant losses in the classical sense of Nesterov and Nemirovski, i.e., those whose third derivative is uniformly bounded by the $3/2$ power of the second derivative; (ii) pseudo self-concordant losses, for which the power is removed. These classes contain losses corresponding to several generalized linear models, including the logistic loss and pseudo-Huber losses. Our basic result, under minimal assumptions, bounds the critical sample size by $O(d \cdot d_{\text{eff}})$, where $d$ is the parameter dimension and $d_{\text{eff}}$ is the effective dimension that accounts for model misspecification. In contrast to the existing results, we only impose local assumptions that concern the population risk minimizer $\theta_*$. Namely, we assume that the calibrated design, i.e., the design scaled by the square root of the second derivative of the loss, is subgaussian at $\theta_*$. In addition, for type (ii) losses we require boundedness of a certain measure of curvature of the population risk at $\theta_*$. Our improved result bounds the critical sample size from above as $O(\max\{d_{\text{eff}}, d \log d\})$ under slightly stronger assumptions. Namely, the local assumptions must hold in the neighborhood of $\theta_*$ given by the Dikin ellipsoid of the population risk. Interestingly, we find that, for logistic regression with Gaussian design, there is no actual restriction of conditions: the subgaussian parameter and curvature measure remain near-constant over the Dikin ellipsoid.
Finally, we extend some of these results to $\ell_1$-penalized estimators in high dimensions.
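
The distinction between the two loss classes can be illustrated numerically on the logistic loss. The sketch below is not taken from the paper: it assumes the scalar logistic loss $\phi(t) = \log(1 + e^{-t})$ and a pseudo self-concordance constant of 1, and the function names and the grid are hypothetical choices made for illustration. The paper's exact definitions and constants may differ.

```python
import numpy as np

def logistic_derivatives(t):
    """Second and third derivatives of phi(t) = log(1 + exp(-t)).

    With s = sigmoid(t): phi''(t) = s(1 - s), phi'''(t) = s(1 - s)(1 - 2s).
    """
    s = 1.0 / (1.0 + np.exp(-t))
    d2 = s * (1.0 - s)
    d3 = d2 * (1.0 - 2.0 * s)
    return d2, d3

def pseudo_ratio(t):
    """|phi'''| / phi'' : bounded iff the loss is pseudo self-concordant."""
    d2, d3 = logistic_derivatives(t)
    return np.abs(d3) / d2

def classical_ratio(t):
    """|phi'''| / phi''^(3/2) : bounded iff classically self-concordant."""
    d2, d3 = logistic_derivatives(t)
    return np.abs(d3) / d2 ** 1.5

# Illustrative grid (an assumption, chosen to avoid floating-point underflow).
grid = np.linspace(-30.0, 30.0, 2001)

max_pseudo = pseudo_ratio(grid).max()        # stays at or below 1
max_classical = classical_ratio(grid).max()  # grows as |t| grows

print(f"max |phi'''|/phi''       on grid: {max_pseudo:.6f}")
print(f"max |phi'''|/phi''^(3/2) on grid: {max_classical:.3e}")
```

On this grid the pseudo ratio equals $|1 - 2\sigma(t)|$ and so never exceeds 1, while the classical ratio equals $|1 - 2\sigma(t)| / \sqrt{\sigma(t)(1 - \sigma(t))}$ and blows up in the tails. This is consistent with the logistic loss falling into class (ii) rather than class (i).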
