Figure 4 | BMC Medical Informatics and Decision Making

Figure 4

From: Improved de-identification of physician notes through integrative modeling of both public and private medical text

Part of speech is highly informative for the PHI class. Normalized pointwise mutual information (NPMI) was calculated between each part of speech (POS) and each PHI class. A score of 1 signifies that the POS and PHI type always occur together; a score of −1 signifies that they never occur together. The scoring matrix was clustered using the Euclidean distance between the normalized scores. The results reveal that nouns and numbers have distinct groupings of PHI classes, whereas all other parts of speech reduce the probability of any private PHI class.
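
The scoring described in the caption can be illustrated with a minimal sketch. It assumes the standard definition of normalized pointwise mutual information (NPMI), that is, PMI divided by the negative log of the joint probability, estimated from token-level (POS, PHI class) pairs. The function names `npmi_matrix` and `row_distance` and the tag labels in the usage note are hypothetical and are not taken from the article; this is not the authors' code.

```python
import math
from collections import Counter


def npmi_matrix(pairs):
    """Return {(pos, phi): NPMI} for every POS / PHI-class combination.

    `pairs` is an iterable of (pos, phi) tuples, one per token.
    Probabilities are plain relative frequencies (an assumption).
    """
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)
    pos_counts = Counter(pos for pos, _ in pairs)
    phi_counts = Counter(phi for _, phi in pairs)

    scores = {}
    for pos in pos_counts:
        for phi in phi_counts:
            count = joint[(pos, phi)]
            if count == 0:
                # Never observed together: NPMI takes its limiting value of -1.
                scores[(pos, phi)] = -1.0
                continue
            p_xy = count / n
            if p_xy == 1.0:
                # Degenerate case (only one pair type in the data): avoid 0/0.
                scores[(pos, phi)] = 1.0
                continue
            p_x = pos_counts[pos] / n
            p_y = phi_counts[phi] / n
            pmi = math.log(p_xy / (p_x * p_y))
            scores[(pos, phi)] = pmi / -math.log(p_xy)
    return scores


def row_distance(scores, pos_a, pos_b, phi_classes):
    """Euclidean distance between the NPMI rows of two parts of speech."""
    row_a = [scores[(pos_a, phi)] for phi in phi_classes]
    row_b = [scores[(pos_b, phi)] for phi in phi_classes]
    return math.dist(row_a, row_b)
```

As a usage example, calling `npmi_matrix` on a toy list such as `[("NOUN", "NAME"), ("NUM", "DATE"), ("VERB", "NONE")]` (illustrative labels only) gives a score close to 1 for each pair that always co-occurs and exactly −1 for each pair that never does. Pairwise `row_distance` values could then be passed to any hierarchical clustering routine to group parts of speech with similar PHI profiles.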
