
Table 1 Binary vector similarity measures

From: Analysis and identification of drug similarity through drug side effects and indications data

| Measure | Equation | Description | Range |
| --- | --- | --- | --- |
| Jaccard | \(S_{Jaccard} = \frac{a}{a + b + c}\) | A normalization of the inner product [23] | [0,1] |
| Dice | \(S_{Dice-2} = \frac{2a}{2a + b + c}\) | A normalization of the inner product [24] | [0,1] |
| Tanimoto | \(S_{Tanimoto} = \frac{a}{(a + b) + (a + c) - a}\) | A normalization of the inner product [22] | [0,1] |
| Ochiai | \(S_{Ochiai-1} = \frac{a}{\sqrt{(a + b)(a + c)}}\) | A normalization of the inner product [25] | [0,1] |

  1. Suppose that two objects or patterns, i and j, are represented as binary feature vectors. a is the number of features for which i and j are both 1 (presence), i.e. 'positive matches'; b is the number of features for which (i, j) = (0, 1), i.e. 'i absence mismatches'; c is the number of features for which (i, j) = (1, 0), i.e. 'j absence mismatches'.
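
The counts a, b and c defined in the footnote can be computed directly from two binary vectors, and the four measures then follow from them. The sketch below is illustrative only and is not taken from the article: the function names `counts` and `similarities` are hypothetical, Dice and Ochiai are assumed to take their standard forms \(2a/(2a + b + c)\) and \(a/\sqrt{(a + b)(a + c)}\) (the forms consistent with the stated range [0,1]), and returning 0.0 when a denominator would be zero is a convention chosen here, not one stated in the source.

```python
from math import sqrt


def counts(i, j):
    """Return (a, b, c) for two equal-length binary vectors i and j."""
    if len(i) != len(j):
        raise ValueError("vectors must have the same length")
    a = sum(1 for x, y in zip(i, j) if x == 1 and y == 1)  # present in both
    b = sum(1 for x, y in zip(i, j) if x == 0 and y == 1)  # absent in i, present in j
    c = sum(1 for x, y in zip(i, j) if x == 1 and y == 0)  # present in i, absent in j
    return a, b, c


def similarities(i, j):
    """Compute the four similarity measures from the counts a, b, c."""
    a, b, c = counts(i, j)
    if a == 0:
        # Assumption: with no positive matches every measure is reported as 0.0,
        # which also avoids division by zero when both vectors are all zeros.
        return {"jaccard": 0.0, "dice": 0.0, "tanimoto": 0.0, "ochiai": 0.0}
    return {
        "jaccard": a / (a + b + c),
        "dice": 2 * a / (2 * a + b + c),
        "tanimoto": a / ((a + b) + (a + c) - a),
        "ochiai": a / sqrt((a + b) * (a + c)),
    }
```

For example, with i = [1, 1, 0, 1, 0] and j = [1, 0, 1, 1, 0] the counts are a = 2, b = 1, c = 1, so Jaccard and Tanimoto both equal 2/4 = 0.5, while Dice (4/6) and Ochiai (2/3) both equal about 0.667. The Tanimoto denominator simplifies to a + b + c, which is why it coincides with Jaccard for binary vectors.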