
ORACLE: Explaining Feature Interactions in Neural Networks with ANOVA

13 September 2025

Dongseok Kim

Wonjun Jeong

Mohamed Jismy Aashik Rasool

Gisung Oh


Main: 7 Pages

4 Figures

Bibliography: 3 Pages

12 Tables

Appendix: 19 Pages

Abstract

We introduce ORACLE, a framework for explaining neural networks on tabular data and scientific factorial designs. ORACLE summarizes a trained network's prediction surface with main effects and pairwise interactions by treating the network as a black-box response, discretizing the inputs onto a grid, and fitting an orthogonal factorial (ANOVA-style) surrogate -- the $L^2$ orthogonal projection of the model response onto a finite-dimensional factorial subspace. A simple centering and $\mu$-rebalancing step then expresses this surrogate as main- and interaction-effect tables that remain faithful to the original model in the $L^2$ sense. The resulting grid-based interaction maps are easy to visualize, comparable across backbones, and directly aligned with classical design-of-experiments practice. On synthetic factorial benchmarks and low- to medium-dimensional tabular regression tasks, ORACLE more accurately recovers ground-truth interaction structure and hotspots than Monte Carlo SHAP-family interaction methods, as measured by ranking, localization, and cross-backbone stability. We also discuss its scope in latent image and text settings: grid-based factorial surrogates are most effective when features admit an interpretable factorial structure, making ORACLE particularly well-suited to scientific and engineering workflows that require stable DoE-style interaction summaries.
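The grid-and-project idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a uniform weight on every grid cell (the paper's $\mu$-rebalancing step is not reproduced here), and the function `black_box`, the helper names, and the 5-point grid are all hypothetical stand-ins for a trained network and its discretized inputs. Under uniform weights on a full grid, the classical balanced-ANOVA tables coincide with the $L^2$ projection onto the main-plus-pairwise subspace.

```python
import numpy as np

def black_box(X):
    # Hypothetical stand-in for a trained network: X has shape (N, 3).
    return X[:, 0] + 2.0 * X[:, 1] + 3.0 * X[:, 0] * X[:, 2]

def anova_tables(f, grids):
    """Evaluate f on the full grid; return response, grand mean,
    centered main-effect tables and centered pairwise-interaction tables.
    Assumes uniform weights on grid cells and at least three features."""
    mesh = np.meshgrid(*grids, indexing="ij")
    X = np.stack([m.ravel() for m in mesh], axis=1)
    Y = f(X).reshape([len(g) for g in grids])
    d = Y.ndim
    mean = Y.mean()
    main = {}
    for i in range(d):
        other = tuple(a for a in range(d) if a != i)
        main[i] = Y.mean(axis=other) - mean
    inter = {}
    for i in range(d):
        for j in range(i + 1, d):
            other = tuple(a for a in range(d) if a not in (i, j))
            cell = Y.mean(axis=other)  # shape (n_i, n_j)
            inter[(i, j)] = cell - mean - main[i][:, None] - main[j][None, :]
    return Y, mean, main, inter

def surrogate(shape, mean, main, inter):
    """Rebuild the main + pairwise surrogate on the grid by broadcasting."""
    d = len(shape)
    S = np.full(shape, mean)
    for i, eff in main.items():
        idx = [None] * d
        idx[i] = slice(None)
        S = S + eff[tuple(idx)]
    for (i, j), eff in inter.items():
        idx = [None] * d
        idx[i] = slice(None)
        idx[j] = slice(None)
        S = S + eff[tuple(idx)]
    return S

grid = np.linspace(-1.0, 1.0, 5)  # assumed discretization per feature
Y, mean, main, inter = anova_tables(black_box, [grid, grid, grid])
S = surrogate(Y.shape, mean, main, inter)

# Fraction of response variance captured by the surrogate on the grid.
r2 = 1.0 - np.sum((Y - S) ** 2) / np.sum((Y - mean) ** 2)
print("surrogate R^2 on grid:", r2)
```

Because the toy response contains no three-way term, the surrogate reproduces it on the grid, and the only nonzero interaction table is the one for features 0 and 2. For a real network the residual `Y - S` would generally be nonzero and measures what main effects and pairwise interactions fail to capture.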
