Interpreting a Recurrent Neural Network's Predictions of ICU Mortality Risk
Despite the success of deep learning models in healthcare, their lack of transparency has impeded their acceptance. The goals of this work were to highlight which features contributed to a recurrent neural network's (RNN) predictions of ICU mortality and to compare this information with clinical expectations. Feature contributions to the RNN's predictions for individual patients were computed using two methods: Learned Binary Masks (LBM), a new occlusion-based method developed here, and KernelSHAP, an existing model-agnostic interpretability method. Both methods compute the contribution of each input feature to the RNN's prediction at each time step, generating a matrix with the same dimensions as the patient's input data matrix. Feature contributions were extracted, analyzed, and presented for two patients with different diagnoses whose RNN predictions followed similar trajectories. LBM and KernelSHAP showed that the RNN relied on input features consistent with clinical expectations for each patient's disease trajectory. In addition, feature contributions were averaged across sub-populations so that the contributions to mortality predictions could be compared between any two cohorts. This measure, called Relative Attribution Feature (RAF), is similar to the risk factors distilled from traditional clinical research. The top 10 RAFs were computed for two well-studied diseases, sepsis and brain neoplasm, and aligned with clinical expectations for each disease. Finally, feature contributions were averaged across all patients and times to generate a form of model "feature importance" that describes the overall importance of each feature, analogous to analyzing the weights of a logistic regression.
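The abstract does not spell out the LBM objective, but an occlusion-based learned mask can be sketched as follows: a relaxed binary mask over the (time x feature) input is optimized so that the masked input reproduces the model's original prediction while the mask stays sparse; the learned mask values then serve as the contribution matrix. Everything in this sketch (the name rnn_model, the loss form, the sparsity weight, the optimizer settings) is a hypothetical illustration, not the paper's implementation.

```python
import torch

def learned_mask(rnn_model, x, steps=300, lr=0.05, sparsity=0.01):
    """Hypothetical occlusion-style learned mask for one patient.

    Assumes rnn_model maps a (T, D) tensor of vitals/labs to a scalar
    mortality risk and x is one patient's (T, D) input. The true LBM
    objective may differ; this only illustrates the general idea.
    """
    logits = torch.zeros_like(x, requires_grad=True)  # mask parameters
    target = rnn_model(x).detach()                    # prediction on the full input
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        mask = torch.sigmoid(logits)                  # relaxed binary mask in (0, 1)
        pred = rnn_model(x * mask)                    # occlude down-weighted entries
        # Preserve the prediction while selecting as few entries as possible.
        loss = (pred - target).pow(2).mean() + sparsity * mask.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(logits).detach()             # (T, D) contribution matrix
```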
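KernelSHAP is available in the shap package as shap.KernelExplainer, which expects a prediction function over flat feature vectors; a common way to apply it to a sequence model is to flatten the (T, D) input and reshape inside the wrapper, which yields a contribution matrix of the same shape as the input, as the abstract describes. The toy model and data below are stand-ins, not the study's RNN.

```python
import numpy as np
import shap

T, D = 12, 5                                   # toy sizes: 12 time steps, 5 features
rng = np.random.default_rng(0)
background = rng.normal(size=(20, T, D))       # reference patients (illustrative)
x_patient = rng.normal(size=(T, D))            # one patient's input matrix
W = rng.normal(size=(T, D))                    # weights of the stand-in model

def predict_fn(flat_batch):
    # KernelExplainer passes a 2-D array; reshape each row back to (T, D).
    seqs = flat_batch.reshape(-1, T, D)
    return 1.0 / (1.0 + np.exp(-(seqs * W).sum(axis=(1, 2))))

explainer = shap.KernelExplainer(predict_fn, background.reshape(-1, T * D))
sv = explainer.shap_values(x_patient.reshape(1, T * D), nsamples=500)
contribs = np.asarray(sv).reshape(T, D)        # contribution per feature per time
```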
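One plausible reading of the cohort comparison and the overall "feature importance" is straightforward averaging of the per-patient contribution matrices. The exact RAF formula is not stated in the abstract, so the difference-based ranking below is only illustrative; the function names and the choice of mean absolute contribution are assumptions.

```python
import numpy as np

def cohort_profile(contrib_matrices):
    # contrib_matrices: list of (T_i, D) contribution matrices, one per patient.
    # Average over time within each patient, then over the cohort -> (D,) vector.
    return np.mean([m.mean(axis=0) for m in contrib_matrices], axis=0)

def top_rafs(cohort_a, cohort_b, feature_names, k=10):
    # Rank features by how differently they contribute in the two cohorts
    # (illustrative stand-in for the paper's RAF computation).
    delta = cohort_profile(cohort_a) - cohort_profile(cohort_b)
    order = np.argsort(-np.abs(delta))[:k]
    return [(feature_names[i], float(delta[i])) for i in order]

def overall_importance(all_matrices):
    # Mean absolute contribution across all patients and times, read
    # analogously to the weights of a logistic regression.
    return np.mean([np.abs(m).mean(axis=0) for m in all_matrices], axis=0)
```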