We consider the problem of constructing distribution-free prediction sets when there are random effects. For i.i.d. data, prediction sets can be constructed using the method of conformal prediction. The validity of these sets hinges on the assumption that the data are exchangeable, which fails when there are random effects. We extend the conformal method so that it remains valid with random effects. We develop a CDF pooling approach, a single subsampling approach, and a repeated subsampling approach to construct conformal prediction sets in unsupervised and supervised settings, and we compare these approaches in terms of coverage and average set size. We recommend the repeated subsampling approach, which constructs a conformal set by repeatedly sampling one observation from each distribution. Simulations show that this approach has the best balance between coverage and average conformal set size.
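As a rough illustration of the single subsampling idea in the unsupervised setting, the sketch below draws one observation from each group, so that the retained points are exchangeable with a new observation from a new group, and then forms an interval from their order statistics. The function names, the equal-tailed order-statistic interval, and the toy data are assumptions made for illustration, not the authors' implementation. It also assumes the groups themselves are i.i.d. draws from a common population of distributions. The repeated subsampling approach would repeat the draw and combine the results; that combination step is not shown here.

```python
import math
import random


def subsample_one_per_group(groups, rng):
    """Draw one observation uniformly at random from each group.

    `groups` is a list of non-empty lists of numbers (hypothetical layout).
    """
    return [rng.choice(group) for group in groups]


def order_statistic_interval(sample, alpha):
    """Equal-tailed interval [Y_(r), Y_(s)] from an exchangeable sample.

    With k exchangeable draws and no ties, the rank of a new exchangeable
    point among all k + 1 values is uniform, so choosing
    r = floor((k + 1) * alpha / 2) and s = k + 1 - r gives coverage of at
    least 1 - alpha. If r == 0 the sample is too small for the requested
    level and the interval is the whole real line.
    """
    k = len(sample)
    ordered = sorted(sample)
    r = math.floor((k + 1) * alpha / 2)
    if r == 0:
        return (-math.inf, math.inf)
    s = k + 1 - r
    # Order statistics are 1-indexed; Python lists are 0-indexed.
    return (ordered[r - 1], ordered[s - 1])


def single_subsample_interval(groups, alpha, seed=0):
    """One-observation-per-group subsample followed by the interval above."""
    rng = random.Random(seed)
    sample = subsample_one_per_group(groups, rng)
    return order_statistic_interval(sample, alpha)


if __name__ == "__main__":
    # Toy data: 60 groups, each with its own random mean (a random effect).
    data_rng = random.Random(1)
    toy_groups = []
    for _ in range(60):
        group_mean = data_rng.gauss(0.0, 1.0)
        size = data_rng.randint(2, 10)
        toy_groups.append([data_rng.gauss(group_mean, 0.5) for _ in range(size)])
    print(single_subsample_interval(toy_groups, alpha=0.1))
```

Using only one observation per group discards data, which is one plausible reason to repeat the subsampling step rather than rely on a single draw.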