Fig. 1

From: Identifying and evaluating clinical subtypes of Alzheimer’s disease in care electronic health records using unsupervised machine learning

Reproducibility validation flow diagram showing how the AD cohort and UD cohort are used to validate the original AD clustering in different datasets: (1) split the AD cohort into a training set and a test set, (2) cluster the patients in the training set using a clustering method, (3) split the training set into decision tree training and cross-validation subsets, then train a decision tree, (4) label the test sets with the trained decision tree (gold standard labels), (5) repeat the clustering method, (6) find the % discordance between the decision tree labels and the cluster labels to quantify reproducibility. AD Alzheimer's disease, UD unspecified dementia
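
Step (6) of the flow can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `percent_discordance` is hypothetical, and the sketch assumes that the repeated clustering assigns arbitrary cluster IDs, so those IDs must first be aligned with the decision tree labels. Here the alignment is done by trying every one-to-one relabelling and keeping the one with the fewest disagreements, which is only practical for a small number of clusters.

```python
from itertools import permutations


def percent_discordance(tree_labels, cluster_labels):
    """Percentage of patients whose two labels disagree after relabelling.

    Assumption: cluster IDs carry no meaning of their own, so each
    one-to-one mapping of cluster IDs onto tree labels is tried and the
    mapping with the fewest disagreements is used.
    """
    if len(tree_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    if len(tree_labels) == 0:
        raise ValueError("label sequences must not be empty")

    # Distinct IDs in order of first appearance (no sorting needed).
    tree_ids = list(dict.fromkeys(tree_labels))
    cluster_ids = list(dict.fromkeys(cluster_labels))

    # If the clustering produced more groups than the tree, pad the
    # targets with None; patients mapped to None always count as discordant.
    extra = max(0, len(cluster_ids) - len(tree_ids))
    targets = tree_ids + [None] * extra

    best_mismatches = len(tree_labels)
    for perm in permutations(targets, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, perm))
        mismatches = sum(
            mapping[c] != t for t, c in zip(tree_labels, cluster_labels)
        )
        best_mismatches = min(best_mismatches, mismatches)

    return 100.0 * best_mismatches / len(tree_labels)


# Hypothetical labels for six test-set patients (not data from the study).
tree = [0, 0, 1, 1, 2, 2]        # decision tree ("gold standard") labels
clusters = [2, 2, 0, 0, 1, 1]    # same grouping, different cluster IDs
print(percent_discordance(tree, clusters))
```

Because the two hypothetical labelings describe the same grouping under different IDs, the discordance here is zero; a lower percentage indicates a more reproducible clustering.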