
Table 2 Comparison table of performance metrics for MLA to standard scoring systems, at time of severe sepsis onset

From: Validation of a machine learning algorithm for early severe sepsis prediction: a retrospective study predicting severe sepsis up to 48 h in advance using a diverse dataset from 461 US hospitals

 

| Metric | MLA ≥ 0.029 (DAD training) | MLA ≥ 0.030 (DAD testing) | MLA ≥ 0.017 (CHH external validation) | MEWS ≥ 2 (DAD testing) | SOFA ≥ 2 (DAD testing) | SIRS ≥ 1 (DAD testing) |
|---|---|---|---|---|---|---|
| AUROC (SD) | 0.931 (0.01) | 0.930 (0.01) | 0.948 (0.01) | 0.725 | 0.716 | 0.655 |
| P value (MLA vs comparator) | | | | P < 0.001 | P < 0.001 | P < 0.001 |
| Sensitivity | 0.800 | 0.800 | 0.800 | 0.845 | 0.750 | 0.868 |
| Specificity | 0.926 | 0.933 | 0.921 | 0.444 | 0.554 | 0.334 |
| Accuracy | 0.923 | 0.929 | 0.920 | 0.608 | 0.645 | 0.646 |
| DOR | 53.105 | 56.508 | 47.532 | 4.358 | 3.720 | 3.290 |
| LR+ | 11.411 | 12.110 | 10.306 | 1.521 | 1.680 | 1.303 |
| LR− | 0.216 | 0.215 | 0.217 | 0.349 | 0.452 | 0.396 |

Detailed performance metrics for the machine learning algorithm (MLA) and the rules-based scoring systems, measured at the time of severe sepsis onset. The Dascena Analysis Dataset (DAD) was used for training and testing, and the Cabell Huntington Hospital (CHH) dataset for external validation. The score threshold reported for the MLA is the average over rounds of ten-fold cross-validation. AUROC values for the MLA and each comparator were compared using two-sample t-tests at the 95% confidence level. Abbreviations: AUROC, area under the receiver operating characteristic; MEWS, Modified Early Warning Score; SOFA, Sequential Organ Failure Assessment; SIRS, Systemic Inflammatory Response Syndrome; DOR, diagnostic odds ratio; LR, likelihood ratio.
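The likelihood ratios and the diagnostic odds ratio follow from sensitivity and specificity by their standard definitions: LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. The sketch below is an illustration only, not the authors' code; the names `DerivedMetrics` and `derived_metrics` are hypothetical. It recomputes the three derived metrics for the MEWS ≥ 2 column from the rounded sensitivity and specificity in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedMetrics:
    """Metrics derived from sensitivity and specificity (hypothetical helper type)."""
    lr_pos: float  # positive likelihood ratio, LR+
    lr_neg: float  # negative likelihood ratio, LR-
    dor: float     # diagnostic odds ratio


def derived_metrics(sensitivity: float, specificity: float) -> DerivedMetrics:
    """Apply the standard definitions of LR+, LR- and DOR."""
    # Values of exactly 0 or 1 would cause a division by zero, so reject them.
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return DerivedMetrics(lr_pos=lr_pos, lr_neg=lr_neg, dor=lr_pos / lr_neg)


# MEWS >= 2 on DAD testing: sensitivity 0.845, specificity 0.444 (rounded table values).
mews = derived_metrics(0.845, 0.444)
print(f"LR+ = {mews.lr_pos:.3f}, LR- = {mews.lr_neg:.3f}, DOR = {mews.dor:.3f}")
```

Because the inputs are the three-decimal values printed in the table, the recomputed figures can differ from the reported ones in the last digits. The gap may be larger for the MLA columns, where the reported metrics may have been averaged across cross-validation folds rather than derived from the rounded averages.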