
A unified framework for dataset shift diagnostics

17 May 2022
Felipe Maia Polo
Rafael Izbicki
Evanildo G. Lacerda
J. Ibieta-Jimenez
R. Vicente
arXiv:2205.08340 (abs | PDF | HTML)
Abstract

Most supervised learning methods assume that the data used in the training phase comes from the target population. In practice, however, one often faces dataset shift, which, if not adequately taken into account, may degrade the performance of the resulting predictors. In this work, we propose a novel and flexible framework called DetectShift that enables quantification and testing of various types of dataset shift, including shifts in the distributions of (X, Y), X, Y, X|Y, and Y|X. DetectShift provides practitioners with insights about changes in their data, allowing them to leverage source and target data to retrain or adapt their predictors. That is particularly valuable in scenarios where labeled samples from the target domain are scarce. The framework uses test statistics of the same nature to quantify the magnitude of the various shifts, making the results more interpretable. Moreover, it can be applied to both regression and classification tasks, as well as to different types of data such as tabular, text, and image data. Experimental results demonstrate the effectiveness of DetectShift in detecting dataset shifts even in high dimensions. Our implementation of DetectShift is available at https://github.com/felipemaiapolo/detectshift.
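One common way to implement shift tests of this flavor, broadly in the spirit of what the abstract describes, is to train a probabilistic classifier to distinguish source from target samples, turn its predicted odds into a KL-type divergence estimate, and calibrate that statistic with a permutation test. The sketch below is illustrative only: it assumes NumPy and scikit-learn, and the function names shift_statistic and permutation_pvalue are hypothetical, not the detectshift package API (see the repository above for the actual implementation).

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def shift_statistic(S, T):
        """KL-type shift estimate between source S and target T (2-D arrays):
        train a classifier to separate them, then average the log density
        ratio implied by its predicted odds on the target samples."""
        clf = LogisticRegression(max_iter=1000)
        Z = np.vstack([S, T])
        d = np.concatenate([np.zeros(len(S)), np.ones(len(T))])
        clf.fit(Z, d)
        p = np.clip(clf.predict_proba(T)[:, 1], 1e-6, 1 - 1e-6)
        # The odds p/(1-p) estimate (n_T/n_S) times the density ratio,
        # so multiply by n_S/n_T to correct for unequal sample sizes.
        return float(np.mean(np.log(p / (1 - p) * (len(S) / len(T)))))

    def permutation_pvalue(S, T, n_perm=200, seed=0):
        """Calibrate the statistic under H0 (no shift) by reshuffling
        which rows count as 'source' and which as 'target'."""
        rng = np.random.default_rng(seed)
        obs = shift_statistic(S, T)
        Z = np.vstack([S, T])
        null = []
        for _ in range(n_perm):
            idx = rng.permutation(len(Z))
            null.append(shift_statistic(Z[idx[:len(S)]], Z[idx[len(S):]]))
        pval = (1 + np.sum(np.array(null) >= obs)) / (1 + n_perm)
        return obs, pval

Applied to the concatenated pairs [X, y], such a test probes total shift in (X, Y); applied to X alone, it probes covariate shift. Because the chain rule of KL divergence decomposes the joint divergence into marginal and conditional terms, a conditional-shift statistic can, for instance, be obtained as the difference between the total and marginal estimates, which is the kind of decomposition that keeps the framework's statistics on a comparable scale.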
