ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2504.21152
19
0

SMOGAN: Synthetic Minority Oversampling with GAN Refinement for Imbalanced Regression

29 April 2025
Shayan Alahyari
Mike Domaratzki
ArXivPDFHTML
Abstract

Imbalanced regression refers to prediction tasks where the target variable is skewed. This skewness hinders machine learning models, especially neural networks, which concentrate on dense regions and therefore perform poorly on underrepresented (minority) samples. Despite the importance of this problem, only a few methods have been proposed for imbalanced regression. Many of the available solutions for imbalanced regression adapt techniques from the class imbalance domain, such as linear interpolation and the addition of Gaussian noise, to create synthetic data in sparse regions. However, in many cases, the underlying distribution of the data is complex and non-linear. Consequently, these approaches generate synthetic samples that do not accurately represent the true feature-target relationship. To overcome these limitations, we propose SMOGAN, a two-step oversampling framework for imbalanced regression. In Stage 1, an existing oversampler generates initial synthetic samples in sparse target regions. In Stage 2, we introduce DistGAN, a distribution-aware GAN that serves as SMOGAN's filtering layer and refines these samples via adversarial loss augmented with a Maximum Mean Discrepancy objective, aligning them with the true joint feature-target distribution. Extensive experiments on 23 imbalanced datasets show that SMOGAN consistently outperforms the default oversampling method without the DistGAN filtering layer.

View on arXiv
@article{alahyari2025_2504.21152,
  title={ SMOGAN: Synthetic Minority Oversampling with GAN Refinement for Imbalanced Regression },
  author={ Shayan Alahyari and Mike Domaratzki },
  journal={arXiv preprint arXiv:2504.21152},
  year={ 2025 }
}
Comments on this paper