# Table 1 The formulas used in the taxonomic concept-based patient similarity

| Measure | # | Formula | Reference |
| --- | --- | --- | --- |
| Information Content (IC) | 1 | $$levels(a \to r)$$ | [13] |
| | 2 | $$-\log \left(\frac{\frac{\lvert leaves(a)\rvert}{\lvert subsumers(a)\rvert}+1}{\lvert leaves(r)\rvert+1}\right)$$ | [14] |
| Code-level Similarity (CS) | 1 | $$\begin{cases}0, & \text{if } a=b\\ 1, & \text{otherwise}\end{cases}$$ | |
| | 2 | $$1-\frac{2\, IC(c)}{IC(a)+IC(b)}$$ | [16, 23] |
| | 3 | $$1-e^{\alpha \left(IC(a)+IC(b)-2\, IC(c)\right)}\cdot \frac{e^{\beta IC(c)}-e^{-\beta IC(c)}}{e^{\beta IC(c)}+e^{-\beta IC(c)}}$$ | [17] |
| | 4 | $$\frac{IC(l)-IC(c)}{IC(l)}$$ | |
| Set-level Similarity (SS) | 1 | Dice: $$1-\frac{2\lvert A\cap B\rvert}{\lvert A\rvert+\lvert B\rvert}$$ | |
| | 2 | Jaccard: $$1-\frac{\lvert A\cap B\rvert}{\lvert A\cup B\rvert}$$ | |
| | 3 | Cosine: $$1-\frac{\lvert A\cap B\rvert}{\sqrt{\lvert A\rvert\cdot \lvert B\rvert}}$$ | |
| | 4 | Overlap: $$1-\frac{\lvert A\cap B\rvert}{\min \left\{\lvert A\rvert,\lvert B\rvert\right\}}$$ | |
| | 5 | $$\frac{1}{\lvert A\rvert+\lvert B\rvert}\cdot \left(\sum \limits_{a\in A}\min \limits_{b\in B} CS\left(a,b\right)+\sum \limits_{b\in B}\min \limits_{a\in A} CS\left(b,a\right)\right)$$ | [17] |
| | 6 | $$\frac{1}{\lvert A\cup B\rvert}\cdot \left(\sum \limits_{a\in A\setminus B}\frac{1}{\lvert B\rvert}\sum \limits_{b\in B} CS\left(a,b\right)+\sum \limits_{b\in B\setminus A}\frac{1}{\lvert A\rvert}\sum \limits_{a\in A} CS\left(b,a\right)\right)$$ | [18] |
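
To make the table concrete, the following is a minimal Python sketch of IC formula 2, CS formula 2, and SS formulas 1 to 5 on a toy taxonomy. It is an illustration under stated assumptions, not the authors' implementation. The taxonomy `PARENT` and all function names are hypothetical. The sketch further assumes that `subsumers(a)` includes `a` itself, that a leaf concept has no leaves of its own, that `c` denotes the deepest common subsumer of `a` and `b`, and that both code sets are non-empty.

```python
import math

# Hypothetical toy taxonomy (child -> parent); "root" plays the role of r.
PARENT = {"A": "root", "B": "root", "A1": "A", "A2": "A", "B1": "B"}
CONCEPTS = set(PARENT) | set(PARENT.values())


def subsumers(a):
    """a together with all of its ancestors up to the root (assumed convention)."""
    chain = [a]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def leaves(a):
    """Leaf concepts strictly below a (assumed: a leaf has no leaves of its own)."""
    internal = set(PARENT.values())
    return {x for x in CONCEPTS
            if x not in internal and x != a and a in subsumers(x)}


def ic(a, root="root"):
    """IC formula 2 from Table 1."""
    ratio = len(leaves(a)) / len(subsumers(a)) + 1
    return -math.log(ratio / (len(leaves(root)) + 1))


def lcs(a, b):
    """Deepest common subsumer of a and b (assumed meaning of c)."""
    b_chain = set(subsumers(b))
    return next(x for x in subsumers(a) if x in b_chain)


def cs2(a, b):
    """CS formula 2: 1 - 2*IC(c) / (IC(a) + IC(b))."""
    if a == b:  # identical codes; also avoids 0/0 when both are the root
        return 0.0
    return 1 - 2 * ic(lcs(a, b)) / (ic(a) + ic(b))


def dice(A, B):
    """SS formula 1."""
    return 1 - 2 * len(A & B) / (len(A) + len(B))


def jaccard(A, B):
    """SS formula 2."""
    return 1 - len(A & B) / len(A | B)


def cosine(A, B):
    """SS formula 3."""
    return 1 - len(A & B) / math.sqrt(len(A) * len(B))


def overlap(A, B):
    """SS formula 4."""
    return 1 - len(A & B) / min(len(A), len(B))


def ss5(A, B, cs=cs2):
    """SS formula 5: average best-match code-level value in both directions."""
    total = sum(min(cs(a, b) for b in B) for a in A)
    total += sum(min(cs(b, a) for a in A) for b in B)
    return total / (len(A) + len(B))


# Two hypothetical patients, each described by a set of codes.
P, Q = {"A1", "B1"}, {"A2", "B1"}
print(jaccard(P, Q), dice(P, Q), ss5(P, Q))
```

On this toy data the set-only measures see just one shared code (Jaccard about 0.667, Dice 0.5), whereas SS formula 5 gives about 0.25 because it credits `A1` and `A2` for sharing the parent `A`. All formulas in the table return 0 for identical inputs, so smaller values indicate more similar patients.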