Table 3 Data extraction summary

From: Prediction and diagnosis of depression using machine learning with electronic health records data: a systematic review

Category

Description/example

Title

Title of journal/conference entry

Journal/ Conference

Publisher

Outcome Benchmark for depression

How the outcome was measured (e.g., PHQ-9 (Patient Health Questionnaire 9), ICD (International Classification of Diseases) code, HADS (Hospital Anxiety and Depression Scale))

Demographic

Characteristics of the participant pool, including age, gender, ethnicity, etc., where specified

Data Source type

EHRs (Electronic Health Records), EMRs (Electronic Medical Records), Clinical Notes, Clinical Records

Data Specifications

Nature and source of data (e.g., types of codes used, organisation that provided the data)

Predictors

Types of predictors used by models and identification of any groupings or subsets they might fall into. The term “predictors” is considered interchangeable with “features” and “exposure variables” or other related terms

Study Design

Case/Control, Case Series, Cohort, etc.

Sample Size Training or Total

Number included in training/total dataset

Sample Size Testing/Validation

Number included in test/validation dataset

Missing Data

Explanation of how instances of missing data were addressed

Model Development Pre-Process

Information relating to the methods used for pre-processing, preparing, cleaning, and extracting data (e.g., natural language and text processing methods)

Model Development Analysis (Fitting)

Information relating to the statistical and ML methods used (statistical techniques and/or broader AI, e.g., neural networks); any additional data pre-processing/preparation, if relevant; and assessment of overfitting

Performance Metric

How model performance was measured/reported (e.g., odds ratio, AUC ROC (Area Under the Curve of the Receiver Operating Characteristic), sensitivity, specificity, accuracy)

Baseline/Comparator

Criteria used to evaluate/compare the model; how the model was assessed against the outcome

Validation

Information relating to the use of validation methods

Testing

Use of independent testing and a separate hold-out set

Results

The results reported (may be in summary form)

Data Availability and sharing

Information relating to data availability, any repository/contact details and conditions that might apply

Code Availability and sharing

Information relating to code availability, any repository/contact details and conditions that might apply

Abstract

Text of study abstract

Full Reference (and Citation)

Supporting unambiguous identification of paper and providing source for citations in tables/figures/text
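
Several of the performance metrics named under "Performance Metric" (sensitivity, specificity, accuracy) follow directly from the four confusion-matrix counts of a binary classifier. The sketch below is illustrative only and is not taken from the review: the function name `classification_metrics` and the example counts are hypothetical, and it assumes a binary depressed/not-depressed outcome.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: cases with the outcome that the model flagged / missed.
    tn/fp: cases without the outcome that the model cleared / wrongly flagged.
    A metric whose denominator is zero is returned as NaN.
    """

    def ratio(numerator: int, denominator: int) -> float:
        # Avoid ZeroDivisionError when a class is absent from the test set.
        return numerator / denominator if denominator else math.nan

    return {
        # Proportion of true cases correctly identified (recall).
        "sensitivity": ratio(tp, tp + fn),
        # Proportion of non-cases correctly identified.
        "specificity": ratio(tn, tn + fp),
        # Proportion of all predictions that were correct.
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical counts for a test set of 100 patients (not data from any study).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can mislead when the outcome is rare in the test set, which is one reason to extract sensitivity and specificity alongside it.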