Table 2 Comparison table of performance metrics for MLA to standard scoring systems, at time of severe sepsis onset

From: Validation of a machine learning algorithm for early severe sepsis prediction: a retrospective study predicting severe sepsis up to 48 h in advance using a diverse dataset from 461 US hospitals

| Metric | MLA ≥ 0.029 (DAD training) | MLA ≥ 0.030 (DAD testing) | MLA ≥ 0.017 (CHH external validation) | MEWS ≥ 2 (DAD testing) | SOFA ≥ 2 (DAD testing) | SIRS ≥ 1 (DAD testing) |
| --- | --- | --- | --- | --- | --- | --- |
| AUROC (SD) | 0.931 (0.01) | 0.930 (0.01) | 0.948 (0.01) | 0.725 | 0.716 | 0.655 |
| P value (MLA vs comparator) | | | | P < 0.001 | P < 0.001 | P < 0.001 |
| Sensitivity | 0.800 | 0.800 | 0.800 | 0.845 | 0.750 | 0.868 |
| Specificity | 0.926 | 0.933 | 0.921 | 0.444 | 0.554 | 0.334 |
| Accuracy | 0.923 | 0.929 | 0.920 | 0.608 | 0.645 | 0.646 |
| DOR | 53.105 | 56.508 | 47.532 | 4.358 | 3.720 | 3.290 |
| LR+ | 11.411 | 12.110 | 10.306 | 1.521 | 1.680 | 1.303 |
| LR− | 0.216 | 0.215 | 0.217 | 0.349 | 0.452 | 0.396 |
Detailed performance metrics for the machine learning algorithm (MLA) and rules-based systems, measured at the time of severe sepsis onset, using the Dascena Analysis Dataset (DAD) for training and testing and the Cabell Huntington Hospital (CHH) dataset for external validation. The score threshold reported for the MLA is the average over rounds of ten-fold cross-validation. AUROC values for the MLA and the comparators were compared using two-sample t-tests at the 95% confidence level. Abbreviations: AUROC, area under the receiver operating characteristic; SD, standard deviation; MEWS, Modified Early Warning Score; SOFA, Sequential Organ Failure Assessment; SIRS, Systemic Inflammatory Response Syndrome; DOR, diagnostic odds ratio; LR, likelihood ratio.
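The likelihood ratios and diagnostic odds ratio in Table 2 follow from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `diagnostic_ratios` is hypothetical. Because it starts from the rounded values printed in the table, it reproduces the tabulated comparator figures only approximately, and the MLA columns may differ further if those figures were averaged across cross-validation folds (an assumption, not stated in the source).

```python
def diagnostic_ratios(sensitivity: float, specificity: float) -> dict:
    """Return LR+, LR- and DOR from sensitivity and specificity.

    Standard definitions (hypothetical helper, not from the source):
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
      DOR = LR+ / LR-
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return {"lr_pos": lr_pos, "lr_neg": lr_neg, "dor": lr_pos / lr_neg}


if __name__ == "__main__":
    # Sensitivity and specificity as printed in Table 2 (DAD testing columns).
    comparators = {
        "MEWS >= 2": (0.845, 0.444),
        "SOFA >= 2": (0.750, 0.554),
        "SIRS >= 1": (0.868, 0.334),
    }
    for name, (sens, spec) in comparators.items():
        r = diagnostic_ratios(sens, spec)
        print(f"{name}: LR+={r['lr_pos']:.3f}  LR-={r['lr_neg']:.3f}  DOR={r['dor']:.3f}")
```

Read this way, a DOR near 4 for the rules-based scores against roughly 50 for the MLA reflects the gap in specificity far more than any gap in sensitivity, which is similar across all six columns.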