Distribution learning via neural differential equations: a nonparametric statistical perspective

arXiv:2309.01043 · 3 September 2023
Youssef Marzouk, Zhi Ren, Sven Wang, Jakob Zech
Abstract

Ordinary differential equations (ODEs), via their induced flow maps, provide a powerful framework to parameterize invertible transformations for the purpose of representing complex probability distributions. While such models have achieved enormous success in machine learning, particularly for generative modeling and density estimation, little is known about their statistical properties. This work establishes the first general nonparametric statistical convergence analysis for distribution learning via ODE models trained through likelihood maximization. We first prove a convergence theorem applicable to arbitrary velocity field classes $\mathcal{F}$ satisfying certain simple boundary constraints. This general result captures the trade-off between approximation error ('bias') and the complexity of the ODE model ('variance'). We show that the latter can be quantified via the $C^1$-metric entropy of the class $\mathcal{F}$. We then apply this general framework to the setting of $C^k$-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes $\mathcal{F}$: $C^k$ functions and neural networks. The latter is the practically important case of neural ODEs. Our proof techniques require a careful synthesis of (i) analytical stability results for ODEs, (ii) classical theory for sieved M-estimators, and (iii) recent results on approximation rates and metric entropies of neural network classes. The results also provide theoretical insight into how the choice of velocity field class, and the dependence of this choice on the sample size $n$ (e.g., the scaling of width, depth, and sparsity of neural network classes), impacts statistical performance.
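For a concrete picture of the estimator being analyzed, below is a minimal sketch (not from the paper) of distribution learning with a neural ODE trained by likelihood maximization: a small neural-network velocity field plays the role of the class $\mathcal{F}$, its flow map transports a standard Gaussian reference to the data distribution, and the sample log-likelihood is evaluated with the instantaneous change-of-variables formula. The class name `VelocityField`, the fixed-step Euler integration, and all hyperparameters are illustrative assumptions, not choices made in the paper.

```python
# Illustrative sketch of maximum-likelihood distribution learning with a neural ODE.
# All architectural and numerical choices here are assumptions for the example.
import math
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Neural-network velocity field f_theta(x, t); stands in for the class F."""
    def __init__(self, dim=2, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, dim),
        )

    def forward(self, x, t):
        t_col = t.expand(x.shape[0], 1)          # broadcast the scalar time to the batch
        return self.net(torch.cat([x, t_col], dim=1))

def divergence(f, x, t):
    """Exact divergence of f at x (trace of the Jacobian), one coordinate at a time."""
    out = f(x, t)
    div = torch.zeros(x.shape[0])
    for i in range(x.shape[1]):
        grad_i = torch.autograd.grad(out[:, i].sum(), x, create_graph=True)[0]
        div = div + grad_i[:, i]
    return div

def log_likelihood(f, x, n_steps=20):
    """log p_1(x) under the flow dz/dt = f(z, t) started from a standard Gaussian at t = 0.

    Integrate backward from the data (t = 1) to the reference (t = 0) with Euler steps,
    accumulating the divergence integral from the change-of-variables formula
        log p_1(x) = log p_0(z(0)) - int_0^1 div f(z(t), t) dt.
    """
    z = x.clone().requires_grad_(True)
    int_div = torch.zeros(x.shape[0])
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = torch.tensor([1.0 - k * dt])
        int_div = int_div + divergence(f, z, t) * dt
        z = z - f(z, t) * dt                      # backward-in-time Euler step
    log_p0 = -0.5 * (z ** 2).sum(dim=1) - 0.5 * z.shape[1] * math.log(2 * math.pi)
    return log_p0 - int_div

# Maximum-likelihood training on toy 2D data (mixture of two Gaussian bumps).
torch.manual_seed(0)
data = torch.cat([0.3 * torch.randn(500, 2) + 1.5,
                  0.3 * torch.randn(500, 2) - 1.5])
f = VelocityField(dim=2)
opt = torch.optim.Adam(f.parameters(), lr=1e-3)
for step in range(200):
    batch = data[torch.randint(0, data.shape[0], (128,))]
    loss = -log_likelihood(f, batch).mean()       # negative sample log-likelihood
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In practice an adaptive ODE solver and a stochastic trace estimator typically replace the fixed Euler steps and the exact divergence loop, but the objective being maximized is the same sample log-likelihood whose statistical behavior the paper's convergence rates describe.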
