Since the natural logarithm is continuous and increasing throughout its domain \((0,\infty )\), it follows that as \(\ln \left[ \frac{a}{1-b}\right]\) increases, the number of test iterations n needed to achieve a desired positive predictive value decreases, as per Eq. (24). Tables 1, 2, 3 and 4 provide reference values of n as a function of the prevalence \(\phi\), the sensitivity and the specificity for a \(\rho\) of 99, 95, 75 and 50%, respectively. Figure 2 provides a graphic representation of \(n_i\), which, given its geometric shape, we define as the tablecloth function. This relationship holds for a series of identical sequential tests that each return a positive result, up to the \(n_i\)-th iteration, at which the desired positive predictive value is reached.
For conditions whose potential consequences are severe but whose treatment is rather innocuous, a lower threshold to initiate treatment might be acceptable. Conversely, a condition whose consequences are less severe but whose treatment may lead to significant morbidity might benefit from a higher degree of diagnostic certainty prior to initiating therapy or proceeding to an invasive diagnostic test.
Given the extremes of the domains of each predictive function as per Eqs. (8) and (11), and the fact that most conditions have a prevalence well below 20%, it follows that if a negative test result is obtained before the desired positive predictive value is reached, the individual is more likely to be disease-free, since \(\sigma (\phi ) \gg \rho (\phi )\) at low disease prevalence (Fig. 1). In other words, the prevalence \(\phi _i\) at which the NPV and PPV intersect, given by the following equation, hovers around 40–60% for values of sensitivity and specificity greater than 50% (the clinically useful ones). Below this point, the NPV exceeds the PPV.
$$\begin{aligned} \phi _i=\frac{-b^2+b-\sqrt{ab\left( ab-a+1-b\right) }}{a^2-b^2-a+b} \end{aligned}$$
(25)
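As a numerical check of Eq. (25), the following minimal Python sketch evaluates \(\phi _i\) and confirms that the PPV and NPV coincide there. The function names are hypothetical, and the PPV and NPV expressions are the standard Bayesian forms, which we assume match Eqs. (8) and (11). The handling of the degenerate case \(a(1-a)=b(1-b)\), where Eq. (25) becomes 0/0 and its limit is 0.5, is likewise an assumption of this sketch.

```python
import math


def ppv(a, b, phi):
    """Positive predictive value for sensitivity a, specificity b, prevalence phi."""
    return a * phi / (a * phi + (1 - b) * (1 - phi))


def npv(a, b, phi):
    """Negative predictive value for sensitivity a, specificity b, prevalence phi."""
    return b * (1 - phi) / (b * (1 - phi) + (1 - a) * phi)


def intersection_prevalence(a, b):
    """Prevalence at which PPV equals NPV, following Eq. (25).

    Assumes 0 < a < 1 and 0 < b < 1.
    """
    num = -b**2 + b - math.sqrt(a * b * (a * b - a + 1 - b))
    den = a**2 - b**2 - a + b
    if math.isclose(den, 0.0, abs_tol=1e-12):
        # Assumption of this sketch: when a(1-a) == b(1-b), Eq. (25) is 0/0
        # and its limiting value is 0.5.
        return 0.5
    return num / den


phi_i = intersection_prevalence(0.9, 0.8)
print(phi_i, ppv(0.9, 0.8, phi_i), npv(0.9, 0.8, phi_i))
```

For a sensitivity of 0.9 and a specificity of 0.8, the sketch returns a prevalence of 4/7 (about 57%), at which both predictive values are equal.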
It is critical to bear in mind that testing might be done in a representative sample of a population to estimate the rate of asymptomatic carriage; in this case the prevalence is meaningful. But testing is generally done in subjects in whom a condition is suspected, either because they have a known exposure or because they have various levels of symptomatology. In such cases the population prevalence is irrelevant, and it would be more appropriate to refer to prior or pre-test probability instead.
Clinical implications of \(n_i\)
From the formula in (24), we learn that the number of iterations is inversely proportional to the natural logarithm of the ratio of the sensitivity to the complement of the specificity. This ratio is the positive likelihood ratio (+LR) [15].
$$\begin{aligned} n_i\propto {\frac{1}{\ln \left[ \frac{a}{1-b}\right] }} \end{aligned}$$
(26)
However, the denominator of this expression is itself the natural logarithm of a fraction. For certain values of sensitivity a and specificity b, the ratio \(\frac{a}{1-b}\) is less than 1. The natural logarithm of x has the following range properties:
$$\begin{aligned} \ln (x) = \left\{ \begin{array}{ll} \in {\mathbb {C}} &{} {\text {if}}\; x< 0 \\ {\text {undefined}} &{} {\text {if}}\;x = 0 \\< 0 &{} {\text {if}}\;0< x < 1 \\ \ge 0 &{} {\text {if}}\;x \ge 1\\ \end{array}\right. \end{aligned}$$
(27)
We deduce that for values of a and b such that:
$$\begin{aligned} a<1-b \Leftrightarrow a + b < 1 \end{aligned}$$
(28)
the denominator of the \(n_i\) function will be negative and, whenever the desired \(\rho\) exceeds the prevalence \(\phi\) (so that the numerator is positive), so will \(n_i\).
Though it is unlikely that a test whose sensitivity and specificity sum to less than one would often be used clinically [17], this case does lead to a fundamental insight about the \(n_i\) equation. What does it mean to need a negative number of tests to achieve a given \(\rho (\phi )\)? Clinically it bears no meaning, since one would, by definition, need at least a single test to obtain a positive result. It thus follows that for the above equation to be of clinical use, we need to take its ceiling function [18], where \(\lceil x \rceil\) is the unique integer satisfying \(\lceil x \rceil - 1 < x \le \lceil x \rceil\):
$$\begin{aligned} n_i =\lim _{k \rightarrow \rho }\left\lceil \frac{\ln \left[ \frac{k(\phi -1)}{\phi (k-1)}\right] }{\ln \left[ \frac{a}{1-b}\right] }\right\rceil \end{aligned}$$
(29)
In practical terms, the ceiling function rounds a number up to the smallest integer greater than or equal to it [18]. For screening tests, this implies that a whole rather than a fractional number of tests ought to be performed. For example, when 2.8 tests are needed to achieve a desired PPV, one is better off doing 3 tests, given the discrete nature of testing: 3 tests would by definition place one above the desired threshold, whereas 2 tests would yield a lower PPV than desired.
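The effect of rounding up can be illustrated by applying Bayes' theorem iteratively, using each posterior as the prior for the next positive test. The following is a minimal Python sketch with a hypothetical function name. It assumes identical tests whose results are conditionally independent, and it uses the standard PPV expression, which we assume matches the predictive function defined earlier in the text.

```python
def ppv_after(a, b, phi, n):
    """PPV after n consecutive positive results of the same test.

    a: sensitivity, b: specificity, phi: initial prevalence (pre-test probability).
    Assumes the n results are conditionally independent given disease status.
    """
    p = phi
    for _ in range(n):
        # Bayesian update: the posterior of one test becomes the prior of the next.
        p = a * p / (a * p + (1 - b) * (1 - p))
    return p


# Illustrative values: sensitivity 0.9, specificity 0.9, prevalence 1%.
for n in range(1, 5):
    print(n, round(ppv_after(0.9, 0.9, 0.01, n), 4))
```

With these illustrative inputs and a target PPV of 95%, three positive tests give a PPV of roughly 0.88, below the target, whereas four give roughly 0.985, above it.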
Independence of serial testing
From the concepts described in this work, one might easily conclude that simply repeating the same screening test multiple times increases confidence that a positive result is a true positive. Setting aside administrative and feasibility concerns, such an interpretation is theoretically correct, but the reality is more nuanced, as confounding factors might make the same result recur upon serial testing of the same patient. Indeed, repeating the same test under the same conditions, in a similar time-frame, perhaps even by the same interpreter/provider, may not constitute a truly independent observation [19]. Likewise, temporally smooth fluctuations in the biological parameters being measured imply that there should be a temporal separation between subsequent tests. Otherwise stated, the final results are valid only if the probability of receiving subsequent tests is independent of the result of those tests (i.e., we would continue testing those with negative tests in addition to those with positive tests). As such, the primary use of the tables and notions described herein ought to be to contextualize the screening result and broaden the clinical judgement of the provider with regard to the reliability of the screening process. A more natural and reliable method to enhance the positive predictive value would be, when available, to use a different test with different parameters altogether after an initial positive result is obtained [19].
Strengths and limitations
The work presented here is largely theoretical in nature. As such, it carries several strengths, notably: (1) the complete derivation of the resulting equation and tablecloth function from first principles, (2) the use of mathematical language that translates well into clinical scenarios (use of limits to ensure attainable PPV values and use of the ceiling function to obtain a whole number of necessary tests), (3) the development of easily accessible reference tables for clinicians to use and (4) the novelty of the work presented, since, to the best of our knowledge, the idea of sequential testing and Bayesian updating with a single screening test has not previously been explored to a great extent [20]. Nevertheless, the present work has some limitations as well, notably: (1) the lack of clinical data to validate the results, and (2) the concerns regarding its clinical application given the potential issues with obtaining independent testing samples. Despite these limitations, the purpose of this manuscript is to raise awareness about the poor predictive value of many screening tests given the Bayesian limitations of the screening process, and to contextualize the way the predictive value can be enhanced with a single repeated test, at least in theory. Such an equation can contextualize the predictive ability of a single test and may provide additional ways to communicate risk or likelihood of disease in the clinical counselling of patients.