
Table 1 Pre- and in-pandemic performance of the native models. This table shows the means and standard deviations of the pre- and in-pandemic areas under the receiver operating characteristic curve (AUROC) and areas under the precision-recall curve (AUPR), as well as their percentage changes.

From: Susceptibility of AutoML mortality prediction algorithms to model drift caused by the COVID pandemic

  

| Model | n | AUROC: validation set | AUROC: pre-pandemic | AUROC: in-pandemic | AUROC: % change | AUROC: p-value | AUPR: validation set | AUPR: pre-pandemic | AUPR: in-pandemic | AUPR: % change | AUPR: p-value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GLM | 1 | 0.93 | 0.94 | 0.90 | -3.85 | | 0.17 | 0.19 | 0.08 | -57.65 | |
| DRF | 2 | 0.85 (0.01) | 0.90 (0.00) | 0.85 (0.02) | -4.98 (1.59) | 0.14 | 0.07 (0.00) | 0.09 (0.02) | 0.07 (0.00) | -26.26 (15.57) | 0.3 |
| GBM | 32 | 0.86 (0.05) | 0.86 (0.06) | 0.81 (0.06) | -6.26 (5.17) | < 0.001 | 0.10 (0.02) | 0.19 (0.04) | 0.07 (0.03) | -59.95 (18.33) | < 0.001 |
| XGBoost | 198 | 0.90 (0.01) | 0.93 (0.01) | 0.88 (0.02) | -5.36 (1.87) | < 0.001 | 0.11 (0.02) | 0.24 (0.04) | 0.07 (0.01) | -72.08 (8.10) | < 0.001 |
| Deep Learning | 22 | 0.88 (0.03) | 0.90 (0.02) | 0.84 (0.04) | -6.30 (3.15) | < 0.001 | 0.10 (0.02) | 0.14 (0.03) | 0.05 (0.01) | -62.17 (7.57) | < 0.001 |
| Stacked | 14 | 0.95 (0.01) | 0.95 (0.01) | 0.91 (0.01) | -4.70 (0.73) | < 0.001 | 0.27 (0.21) | 0.26 (0.03) | 0.09 (0.00) | -65.91 (6.86) | < 0.001 |
| all | 269 | 0.90 (0.03) | 0.92 (0.03) | 0.87 (0.04) | -5.50 (2.58) | < 0.001 | 0.12 (0.06) | 0.22 (0.05) | 0.07 (0.02) | -69.11 (11.41) | < 0.001 |

GLM: generalized linear model; DRF: default random forest; GBM: gradient boosting machine; XGBoost: eXtreme Gradient Boosting; Stacked: stacked ensemble; all: mean values across all models; n: number of models created by AutoML. Values are given as mean (standard deviation). Pre- and in-pandemic AUROC and AUPR were compared using a paired t-test, with p < 0.05 considered statistically significant. % Change is the mean percentage change for each model family.
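
To make the footnote's two summary quantities concrete, the following is a minimal sketch of how a per-family percentage change and a paired t statistic could be computed from per-model scores. It assumes that "% Change" is the mean of the per-model percentage changes (rather than the percentage change of the family means), which the footnote suggests but does not spell out. The function names and the three AUROC pairs are hypothetical and are not data from the study.

```python
import math
import statistics


def percent_changes(pre, post):
    """Per-model percentage change from the pre-pandemic to the in-pandemic score."""
    return [100.0 * (b - a) / a for a, b in zip(pre, post)]


def paired_t_statistic(pre, post):
    """Return the paired t statistic and its degrees of freedom (n - 1)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("a paired t-test needs at least two pairs")
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd_diff / math.sqrt(n)), n - 1


# Made-up AUROC values for three hypothetical models of one family
# (illustrative only, NOT taken from Table 1).
pre_auroc = [0.90, 0.92, 0.94]
in_auroc = [0.85, 0.88, 0.91]

changes = percent_changes(pre_auroc, in_auroc)
mean_change = statistics.mean(changes)  # what a "% Change" cell would report
sd_change = statistics.stdev(changes)   # the value in parentheses
t_stat, dof = paired_t_statistic(pre_auroc, in_auroc)

print(f"% change: {mean_change:.2f} ({sd_change:.2f}); t = {t_stat:.2f}, df = {dof}")
```

The p-value then follows from the t distribution with n - 1 degrees of freedom (for example via `scipy.stats.ttest_rel`, if SciPy is available). Because the test requires at least two pairs, a family containing a single model cannot be tested this way, which is presumably why the GLM row reports no p-value.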