
Table 1 The formulas used in taxonomic concept-based patient similarity

From: Using the distance between sets of hierarchical taxonomic clinical concepts to measure patient similarity

 

| Measure | # | Formula | Reference |
| --- | --- | --- | --- |
| Information Content (IC) | 1 | levels(a → r) | [13] |
| Information Content (IC) | 2 | \( -\log \left( \frac{\frac{\lvert leaves(a) \rvert}{\lvert subsumers(a) \rvert} + 1}{\lvert leaves(r) \rvert + 1} \right) \) | [14] |
| Code-level Similarity (CS) | 1 | \( \begin{cases} 0, & \text{if } a = b \\ 1, & \text{otherwise} \end{cases} \) | – |
| Code-level Similarity (CS) | 2 | \( 1 - \frac{2\, IC(c)}{IC(a) + IC(b)} \) | [16, 23] |
| Code-level Similarity (CS) | 3 | \( 1 - e^{\alpha \left( IC(a) + IC(b) - 2\, IC(c) \right)} \cdot \frac{e^{\beta IC(c)} - e^{-\beta IC(c)}}{e^{\beta IC(c)} + e^{-\beta IC(c)}} \) | [17] |
| Code-level Similarity (CS) | 4 | \( \frac{IC(l) - IC(c)}{IC(l)} \) | – |
| Set-level Similarity (SS) | 1 | Dice: \( 1 - \frac{2 \lvert A \cap B \rvert}{\lvert A \rvert + \lvert B \rvert} \) | – |
| Set-level Similarity (SS) | 2 | Jaccard: \( 1 - \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert} \) | – |
| Set-level Similarity (SS) | 3 | Cosine: \( 1 - \frac{\lvert A \cap B \rvert}{\sqrt{\lvert A \rvert \cdot \lvert B \rvert}} \) | – |
| Set-level Similarity (SS) | 4 | Overlap: \( 1 - \frac{\lvert A \cap B \rvert}{\min \left\{ \lvert A \rvert, \lvert B \rvert \right\}} \) | – |
| Set-level Similarity (SS) | 5 | \( \frac{1}{\lvert A \rvert + \lvert B \rvert} \cdot \left( \sum \limits_{a \in A} \min \limits_{b \in B} CS(a, b) + \sum \limits_{b \in B} \min \limits_{a \in A} CS(b, a) \right) \) | [17] |
| Set-level Similarity (SS) | 6 | \( \frac{1}{\lvert A \cup B \rvert} \cdot \left( \sum \limits_{a \in A \setminus B} \frac{1}{\lvert B \rvert} \sum \limits_{b \in B} CS(a, b) + \sum \limits_{b \in B \setminus A} \frac{1}{\lvert A \rvert} \sum \limits_{a \in A} CS(b, a) \right) \) | [18] |
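
As a rough illustration of how the formulas in Table 1 fit together, the following Python sketch implements IC formula 2, CS formulas 1 and 2, and all six SS formulas. It makes several assumptions that are not stated in the table. All function and type names are hypothetical. The logarithm is taken to be natural. The cardinalities for IC and the IC values for CS formula 2 are supplied by the caller rather than computed from a real taxonomy. Both code sets are assumed to be non-empty. CS formulas 3 and 4 are left out because the table alone does not define α, β, or l. Note that every formula is written in distance form: identical codes or identical sets score 0.

```python
from math import log, sqrt
from typing import Callable, Hashable, Set

# Hypothetical type aliases for this sketch; they do not come from the source.
Code = Hashable
CodeDistance = Callable[[Code, Code], float]


def ic_intrinsic(n_leaves: int, n_subsumers: int, n_leaves_root: int) -> float:
    """IC formula 2 from precomputed counts (natural log is an assumption)."""
    return -log((n_leaves / n_subsumers + 1) / (n_leaves_root + 1))


def cs_exact(a: Code, b: Code) -> float:
    """CS formula 1: 0 if the two codes are identical, 1 otherwise."""
    return 0.0 if a == b else 1.0


def cs_ic_ratio(ic_a: float, ic_b: float, ic_c: float) -> float:
    """CS formula 2, given IC(a), IC(b), and IC(c) as plain numbers."""
    return 1.0 - 2.0 * ic_c / (ic_a + ic_b)


def ss_dice(A: Set[Code], B: Set[Code]) -> float:
    """SS formula 1 (Dice)."""
    return 1.0 - 2.0 * len(A & B) / (len(A) + len(B))


def ss_jaccard(A: Set[Code], B: Set[Code]) -> float:
    """SS formula 2 (Jaccard)."""
    return 1.0 - len(A & B) / len(A | B)


def ss_cosine(A: Set[Code], B: Set[Code]) -> float:
    """SS formula 3 (Cosine)."""
    return 1.0 - len(A & B) / sqrt(len(A) * len(B))


def ss_overlap(A: Set[Code], B: Set[Code]) -> float:
    """SS formula 4 (Overlap)."""
    return 1.0 - len(A & B) / min(len(A), len(B))


def ss_min_average(A: Set[Code], B: Set[Code], cs: CodeDistance) -> float:
    """SS formula 5: each code is matched to its closest code in the other set."""
    a_to_b = sum(min(cs(a, b) for b in B) for a in A)
    b_to_a = sum(min(cs(b, a) for a in A) for b in B)
    return (a_to_b + b_to_a) / (len(A) + len(B))


def ss_mean_average(A: Set[Code], B: Set[Code], cs: CodeDistance) -> float:
    """SS formula 6: each unshared code is averaged against the whole other set."""
    a_only = sum(sum(cs(a, b) for b in B) / len(B) for a in A - B)
    b_only = sum(sum(cs(b, a) for a in A) / len(A) for b in B - A)
    return (a_only + b_only) / len(A | B)


# Illustrative code sets; the labels are placeholders, not data from the source.
patient_1 = {"I10", "E11", "J45"}
patient_2 = {"I10", "N18"}

print("Dice:   ", ss_dice(patient_1, patient_2))
print("Jaccard:", ss_jaccard(patient_1, patient_2))
print("SS-5:   ", ss_min_average(patient_1, patient_2, cs_exact))
print("SS-6:   ", ss_mean_average(patient_1, patient_2, cs_exact))
```

Passing the code-level measure as a callable keeps the set-level formulas independent of how CS is computed. With the exact-match CS formula 1, SS formula 5 reduces to the Dice distance and SS formula 6 reduces to the Jaccard distance, so the hierarchy only influences the result once an IC-based CS is supplied.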