Decision analysis based on regret theory
Figure 1 depicts a typical decision tree describing administration of treatment guided by a prediction model. There are two competing strategies (treat, and do not treat), and four possible outcomes as described by the combinations: treat/do not treat and necessary/unnecessary.
In Figure 1, p = P(D+) is the probability associated with the presence of the disease as estimated by a prediction model; 1 - p = P(D-) is the probability associated with the absence of the disease; and U_i, i = 1, ..., 4, are the utilities corresponding to each outcome. For example, U_1 is the utility of administering treatment to a patient who has the disease (i.e. treating when necessary), and U_2 is the utility of administering treatment to a patient who does not have the disease (i.e. administering unnecessary treatment). Note that we use the term "treatment" in the generic sense of health care intervention, which may indicate therapy, procedure, or a diagnostic test.
The probabilistic nature of prognostication models significantly complicates the decision process. For example, if a prediction model estimates a 40% probability that a patient has a disease, it is unclear whether this patient should receive treatment or not. A solution from the point of view of classical decision theory is to employ the concept of the threshold probability P_t, which is defined as the probability at which the decision maker is indifferent between two strategies (e.g. administer treatment or not) [27, 29, 30]. Based on the threshold concept, the patient should be treated if p ≥ P_t and should not be treated otherwise.
However, in most cases decisions are made under uncertainty and can never be 100% accurate [23, 26, 28, 31–34]. Thus, after a decision has been made, one may discover that another alternative would have been preferable. This knowledge may bring a sense of loss or regret to the decision maker [23, 26, 28, 31–34]. Regret can be particularly strong when the consequences of wrong decisions are life threatening or seriously influence the quality of the patient's life.
Formally, regret can be expressed as the difference between the utility of the outcome of the action taken and the utility of the outcome of the action that, in retrospect, should have been taken [23, 26, 28, 31–34]. Regret can be felt by any party involved in the decision-making process (e.g. patients receiving treatment, patients' proxies, or physicians administering treatment). For the rest of this paper we assume that the decision maker is the treating physician.
We first employ regret theory to estimate the threshold probability, P_t, at which the physician is indifferent between alternative management strategies (e.g. administer treatment or not). In order to accomplish this, we describe regret in terms of the errors of (1) not treating the patient who has the disease, and (2) treating the patient who does not have the disease.
Figure 2 describes the derivation of regret associated with each strategy based on the utilities of each action's outcome. As can be noted, the regret associated with the error of not treating the patient when he/she should have received treatment (the probability of disease is p ≥ P_t), Rg(Rx-, D+), is equal to the loss in benefits of treatment. This can be expressed as the difference between the utility of receiving treatment and having the disease, and the utility of not receiving treatment and having the disease (U_1 - U_3).
Similarly, the regret associated with treating the patient who should not have received treatment (the probability of disease is p < P_t), Rg(Rx+, D-), is equal to the harms incurred due to treatment. This can be expressed as the difference between the utilities of not having the disease and not receiving treatment, and not having the disease and receiving treatment (U_4 - U_2). We expect no regret in the cases of correct treat/no treat decisions, Rg(Rx+, D+) = Rg(Rx-, D-) = 0. The difference (U_1 - U_3) represents the consequences of not administering treatment where indicated, while (U_4 - U_2) represents the consequences of administering treatment to a patient who does not need it. Under these assumptions, the threshold probability P_t is equal to [27, 29, 30]:
$$P_t = \frac{1}{1 + \dfrac{U_1 - U_3}{U_4 - U_2}} \qquad (1)$$
Equation 1 effectively captures the preferences of the decision maker towards administering or not administering treatment. At the individual level, equation 1 shows how the threshold probability relates to the way the decision maker weighs false negative (i.e. failing to provide necessary treatment) vs. false positive (i.e. administering unnecessary treatment) results[24, 25].
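Note that the fraction is undefined for U_4 - U_2 = 0, which means that in this situation there is no regret associated with administering unnecessary treatment. As U_4 - U_2 approaches 0, P_t approaches 0%, indicating that treatment is justified at any probability of disease. Conversely, when U_1 - U_3 = 0 (no regret associated with withholding treatment), P_t = 100%, indicating that treatment is justified only in case of absolute certainty of disease (p = 100%), a realistically unachievable goal [26].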
Elicitation of threshold probability
There are numerous techniques for eliciting the decision maker's preferences regarding treatment administration [35]. None of them has been proven to be better than the others. We argue that any attempt to measure people's preferences and risk attitudes should be derived from an underlying theory of decision-making that can be applied to the problem or class of problems at hand. We approach elicitation of preferences by capturing people's attitudes (e.g. physicians') through threshold probabilities. Normatively, a threshold probability reflects indifference between two alternative management strategies.
There are a few commonly used methods to assess the value of this indifference for a decision maker, such as the standard gamble and the time trade-off [35–37]. The problem is that both the standard gamble and the time trade-off are time-consuming and cognitively complex, and have been shown to lead to biased estimates of people's preferences [36, 37]. An alternative method is to use rating scales, such as visual analog scales (VAS), which are considerably easier to administer and better understood by the participants. The problem with analog scales, however, is that they cannot capture health state trade-offs [36, 37].
The proposed method retains the simplicity of VAS but it takes into account the consequences of possible mistakes in decision-making by utilizing two visual analog scales. The first scale aims to assess the regret associated with potential error of failing to administer beneficial treatment ("regret of omission"). The second scale measures the regret of administration of unnecessary treatment ("regret of commission"). Using these two scales we can capture trade-offs and compute the threshold probability at which a decision maker is indifferent between two alternative management strategies.
We employed two visual analog scales with the typical 100 points [35–37], anchored by no regret and maximal regret. This is modeled after pain assessment, which limits the maximum possible pain that a person can experience [38]. Accordingly, we can elicit threshold probabilities by asking the physician to weigh the regret associated with wrong decisions (e.g. giving unnecessary treatment vs. failure to administer necessary treatment) using a numerical (0 to 100) scale. The questions may be narrowly defined in relation to specific outcomes (e.g. survival/mortality, heart attack, etc.). We should, however, note that most treatments are associated with multiple dimensions, some good and some bad. This is a fundamental reason why no universally accepted method for assessment of decision makers' preferences has been developed so far. It is very difficult, if not impossible, to accurately determine the trade-offs across multiple outcomes that can be permuted in a number of ways. A solution to this problem is to capture the decision maker's global or "holistic" perception toward treatment. By asking questions about trade-offs in this way, we directly address both cognitive mechanisms of the decision process, intuitive and deliberative. This, in turn, can lead to more accurate assessment of the decision makers' preferences.
For example, to elicit the physician's threshold probability, we may ask the following questions:
1. On a scale of 0 to 100, where 0 indicates no regret and 100 indicates the maximum regret you could feel, how would you rate the level of your regret if you failed to provide necessary treatment to your patient (i.e. did not give treatment that, in retrospect, you should have given)? [Note that the answer to this question corresponds to the (U_1 - U_3) expression in equation 1.]
2. On a scale of 0 to 100, where 0 indicates no regret and 100 indicates the maximum regret you could feel, how would you rate the level of your regret if you had administered unnecessary treatment to your patient (i.e. administered treatment that, in retrospect, should not have been given)? [Note that the answer to this question corresponds to the (U_4 - U_2) expression in equation 1.]
For example, suppose that the physician answers 60 and 30 to questions 1 and 2, respectively. This means that the physician considers it 60/30 = 2 times worse to fail to administer treatment that should have been given than to administer unnecessary treatment. Then, the threshold probability for this physician is:
$$P_t = \frac{1}{1 + 60/30} = \frac{1}{3} \approx 33\%$$
Thus, the physician would be unsure whether to treat the patient if the patient's probability of disease, as computed by the prediction model, was 33%. The recommended action, which is based on elicitation of the decision maker's preferences, is therefore directly derived from the underlying theoretical model.
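The elicitation arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the threshold takes the form P_t = 1 / (1 + regret of omission / regret of commission), which reproduces the 33% of the worked example, and the function name `threshold_probability` is hypothetical.

```python
def threshold_probability(regret_omission: float, regret_commission: float) -> float:
    """Threshold probability P_t from two regret ratings on a 0-100 scale.

    Assumption: P_t = 1 / (1 + regret_omission / regret_commission), where
    regret_omission stands in for (U1 - U3) and regret_commission for (U4 - U2).
    """
    if regret_omission < 0 or regret_commission < 0:
        raise ValueError("regret ratings must be non-negative")
    if regret_omission == 0 and regret_commission == 0:
        raise ValueError("threshold is undefined when both ratings are zero")
    # Algebraically equal to 1 / (1 + omission / commission), but written so
    # that a commission rating of 0 does not cause a division by zero.
    return regret_commission / (regret_omission + regret_commission)


if __name__ == "__main__":
    # Ratings from the worked example: 60 (omission) and 30 (commission).
    print(f"P_t = {threshold_probability(60, 30):.1%}")
```

Writing the ratio as commission / (omission + commission) is a design choice that keeps the boundary cases finite: a commission rating of 0 yields a threshold of 0, and an omission rating of 0 yields a threshold of 1.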
Regret based decision curve analysis (DCA)
Decision-makers may be presented with many alternative strategies that can be difficult to model. A simple yet powerful approach, based on the experience of a typical practicing physician, is to compare the strategy based on modeling with the scenarios in which all or no patients are treated. That is, the clinical alternatives to the prediction model strategy are to assume that all patients have the disease and thus treat them all, or to assume that no patient has the disease and thus treat none [25]. In this case the physician considering treatment faces a choice among three strategies: (1) treat all the patients ("treat all"), (2) treat no patients ("treat none"), and (3) use a prediction model and treat a patient if p ≥ P_t ("model").
The optimal decision depends on the preferences of the decision maker as captured by the threshold probability. We use Decision Curve Analysis (DCA) [24, 25] to identify the range of threshold probabilities at which each strategy ("treat all", "treat none", and "model") is of value. Traditional DCA uses the (net expected) benefits associated with each strategy to recommend the best strategy [24, 25]. In this work, we consider that the optimal strategy is the one that brings the least regret in case it is proven wrong, retrospectively.
One view about decision curves is that they should not be used in clinical practice: the researcher determines whether the decision curve justifies the use of the model in practice and then makes a simple recommendation yes or no as to whether clinicians should base their decisions on the model [39]. Another approach, which we propose here, is that threshold probabilities obtained in clinical practice should be compared against the decision curve to determine which strategy should be used (e.g. use a model, biopsy all men, biopsy no-one). This might be necessary if there is no strategy with the highest net benefit across the entire range of reasonable threshold probabilities.
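Figure 3 depicts the generalized decision tree describing all of the alternative strategies. By solving the decision tree, we can estimate the expected regret associated with each strategy [23, 26, 28, 31–34]. For example, the expected regret of the "model" strategy is:
$$\text{ERg}[\text{Model}] = p \cdot FN \cdot (U_1 - U_3) + (1 - p) \cdot FP \cdot (U_4 - U_2) \qquad (2)$$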
Here, FN (probability of false negatives) represents the conditional probability P(p < P_t | D+) of not treating the patient who has the disease.
FP (probability of false positives) is the conditional probability P(p ≥ P_t | D-) of treating the patient who does not have the disease.
Similarly,
TP = 1 - FN = P(p ≥ P_t | D+) (probability of true positives): probability of treating the patient who has the disease.
TN = 1 - FP = P(p < P_t | D-) (probability of true negatives): probability of not treating the patient who does not have the disease.
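After re-scaling the utilities by dividing each utility by the expression U_1 - U_3, and replacing (U_4 - U_2)/(U_1 - U_3) with P_t/(1 - P_t), which follows from the definition of the threshold probability, we get the expression:
$$\text{ERg}[\text{Model}] = p \cdot FN + (1 - p) \cdot FP \cdot \frac{P_t}{1 - P_t} \qquad (3)$$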
For the strategies of administering treatment and not administering treatment, the expected regret is derived as:
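$$\text{ERg}[\text{Treat all}] = (1 - p) \cdot \frac{P_t}{1 - P_t} \qquad (4)$$
$$\text{ERg}[\text{Treat none}] = p \qquad (5)$$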
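As a numerical illustration, the rescaled expected regrets can be written as three small Python functions. This is a sketch under stated assumptions: regret is expressed in units of (U_1 - U_3), the harm-to-benefit ratio (U_4 - U_2)/(U_1 - U_3) is taken to equal P_t/(1 - P_t), and all function names are hypothetical.

```python
def threshold_odds(p_t: float) -> float:
    """Harm-to-benefit weight P_t / (1 - P_t) implied by the threshold."""
    if not 0.0 <= p_t < 1.0:
        raise ValueError("p_t must lie in [0, 1)")
    return p_t / (1.0 - p_t)


def erg_model(p: float, fn: float, fp: float, p_t: float) -> float:
    """Expected regret of 'model': missed disease plus weighted overtreatment.

    p  = P(D+), fn = P(p < P_t | D+), fp = P(p >= P_t | D-).
    """
    return p * fn + (1.0 - p) * fp * threshold_odds(p_t)


def erg_treat_all(p: float, p_t: float) -> float:
    """Expected regret of 'treat all': every patient without disease is overtreated."""
    return (1.0 - p) * threshold_odds(p_t)


def erg_treat_none(p: float) -> float:
    """Expected regret of 'treat none': every patient with disease is missed."""
    return p


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper.
    p, fn, fp, p_t = 0.3, 0.2, 0.1, 0.2
    print(erg_model(p, fn, fp, p_t), erg_treat_all(p, p_t), erg_treat_none(p))
```

Setting fn = 0 and fp = 1 in `erg_model` recovers `erg_treat_all`, and fn = 1 with fp = 0 recovers `erg_treat_none`, which is a quick consistency check on the three expressions.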
Subtracting each of these expected regrets from the expected regret of the "Treat none" (baseline) strategy we obtain the "Net Expected Regret Difference (NERD)":
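$$\text{NERD}[\text{Treat none}, \text{Model}] = \text{ERg}[\text{Treat none}] - \text{ERg}[\text{Model}] = p \cdot TP - (1 - p) \cdot FP \cdot \frac{P_t}{1 - P_t} \qquad (6)$$
$$\text{NERD}[\text{Treat none}, \text{Treat all}] = \text{ERg}[\text{Treat none}] - \text{ERg}[\text{Treat all}] = p - (1 - p) \cdot \frac{P_t}{1 - P_t} \qquad (7)$$
$$\text{NERD}[\text{Treat none}, \text{Treat none}] = 0 \qquad (8)$$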
Note that these are exactly the same formulas as those derived by Vickers and Elkin [25] who employ the expected-utility model in "decision curve analysis" (DCA). The regret based derivation, however, is mathematically more parsimonious. The original DCA formulation required several mathematical manipulations making the simplicity of regret approach more attractive. In addition, as argued throughout the manuscript, the regret formulation may have additional decision-theoretical advantages as it enables experiencing consequences of decisions both at the emotional (system 1) and cognitive (system 2) level[23, 40].
In addition to equations 6-8, we are interested in the NERD between the strategies "Treat all" and "Model":
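$$\text{NERD}[\text{Treat all}, \text{Model}] = \text{ERg}[\text{Treat all}] - \text{ERg}[\text{Model}] = (1 - p) \cdot TN \cdot \frac{P_t}{1 - P_t} - p \cdot FN \qquad (9)$$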
The NERD equations associated with each strategy, 6-8, can be further reformulated as follows [23, 25, 26, 28, 31–34, 41]:
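$$\text{NERD}[\text{Treat none}, \text{Model}] = P(p \ge P_t \cap D+) - P(p \ge P_t \cap D-) \cdot \frac{P_t}{1 - P_t} \approx \frac{\#TP}{n} - \frac{\#FP}{n} \cdot \frac{P_t}{1 - P_t} \qquad (10)$$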
Similarly, equation 9 can be re-written as:
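$$\text{NERD}[\text{Treat all}, \text{Model}] = P(p < P_t \cap D-) \cdot \frac{P_t}{1 - P_t} - P(p < P_t \cap D+) \approx \frac{\#TN}{n} \cdot \frac{P_t}{1 - P_t} - \frac{\#FN}{n} \qquad (11)$$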
Equations 10 and 11 above are useful when calculating NERD as a function of P_t. The probabilities P(p ≥ P_t ∩ D+), P(p ≥ P_t ∩ D-), P(p < P_t ∩ D+), and P(p < P_t ∩ D-) are estimated as follows:
- P(p ≥ P_t ∩ D+) ≈ #TP/n, the fraction of patients who have the disease and for whom the prognostic probability of disease is greater than or equal to P_t (#TP = number of patients with true positive results; n = total number of patients in the study).
- P(p ≥ P_t ∩ D-) ≈ #FP/n, the fraction of patients who do not have the disease and for whom the prognostic probability of disease is greater than or equal to P_t (#FP = number of patients with false positive results).
- P(p < P_t ∩ D+) ≈ #FN/n, the fraction of patients who have the disease and for whom the prognostic probability of disease is less than P_t (#FN = number of patients with false negative results).
- P(p < P_t ∩ D-) ≈ #TN/n, the fraction of patients who do not have the disease and for whom the prognostic probability of disease is less than P_t (#TN = number of patients with true negative results).
When computing NERD[Treat none, Treat all], we assume that all patients have the disease; thus #TP is the number of people who actually have the disease and #FP is the number of people who do not have the disease but are given treatment. On the other hand, when computing NERD[Treat none, Model] from equation 10 and NERD[Treat all, Model] from equation 11, #TP, #FP, #TN, and #FN are computed for each threshold probability, assuming that a patient has the disease if the prognostic probability is greater than or equal to the threshold probability and does not have the disease otherwise.
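The NERDs of each of the strategies described are plotted against different values of the threshold probability. The NERD values indicate the decrease in regret when two strategies are compared against each other at a given threshold probability. If NERD = 0, there is no difference in regret between the two strategies:
$$\text{NERD}[\text{strategy 1}, \text{strategy 2}] = 0 \iff \text{ERg}[\text{strategy 1}] = \text{ERg}[\text{strategy 2}] \qquad (12)$$
If NERD > 0, the second strategy will inflict less regret than the first strategy, and hence it is preferable:
$$\text{NERD}[\text{strategy 1}, \text{strategy 2}] > 0 \iff \text{ERg}[\text{strategy 1}] > \text{ERg}[\text{strategy 2}] \qquad (13)$$
Similarly, if NERD < 0, the first strategy represents the optimal decision among the two strategies:
$$\text{NERD}[\text{strategy 1}, \text{strategy 2}] < 0 \iff \text{ERg}[\text{strategy 1}] < \text{ERg}[\text{strategy 2}] \qquad (14)$$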
The algorithm for the Regret DCA is implemented as follows:
1. Select a value for the threshold probability.
2. Assuming that patients should be treated if p ≥ P_t and should not be treated otherwise, compute #TP and #FP for the prediction model.
3. Calculate NERD(Treat none, Model) using equation 10.
4. Calculate NERD(Treat all, Model) using equation 11.
5. Compute NERD(Treat none, Treat all) using equation 10, where #TP is the number of patients having the disease and #FP is the number of patients without disease who got treatment.
6. Repeat steps 1-5 for a range of threshold probabilities.
7. Graph each NERD calculated in steps 3-5 against each threshold probability.
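The steps above can be sketched in Python as follows. This is an illustrative reading of the algorithm rather than the authors' code: it assumes the count-based forms NERD(Treat none, Model) ≈ #TP/n - (#FP/n)·P_t/(1 - P_t) and NERD(Treat all, Model) ≈ (#TN/n)·P_t/(1 - P_t) - #FN/n, takes plain Python lists as input, and uses a hypothetical function name and dictionary keys. Plotting (step 7) is left to any charting library.

```python
from typing import Dict, List, Sequence


def regret_dca(
    probs: Sequence[float],
    disease: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Dict[str, float]]:
    """Return one row of NERD values per threshold probability."""
    if len(probs) != len(disease) or len(probs) == 0:
        raise ValueError("probs and disease must be non-empty and equally long")
    n = len(probs)
    n_pos = sum(1 for d in disease if d)  # patients with the disease
    n_neg = n - n_pos                     # patients without the disease
    rows = []
    for p_t in thresholds:                # steps 1 and 6: sweep the thresholds
        if not 0.0 <= p_t < 1.0:
            raise ValueError("thresholds must lie in [0, 1)")
        w = p_t / (1.0 - p_t)             # weight of unnecessary treatment
        # Step 2: classify each patient as treated (p >= P_t) or not.
        tp = sum(1 for p, d in zip(probs, disease) if d and p >= p_t)
        fp = sum(1 for p, d in zip(probs, disease) if not d and p >= p_t)
        fn = n_pos - tp
        tn = n_neg - fp
        rows.append({
            "threshold": p_t,
            "none_vs_model": tp / n - (fp / n) * w,     # step 3
            "all_vs_model": (tn / n) * w - fn / n,      # step 4
            # Step 5: under "treat all", #TP = n_pos and #FP = n_neg.
            "none_vs_all": n_pos / n - (n_neg / n) * w,
        })
    return rows


if __name__ == "__main__":
    # Toy data for illustration only.
    probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
    disease = [True, True, False, True, False, False]
    for row in regret_dca(probs, disease, [0.1, 0.3, 0.5, 0.7]):
        print(row)
```

Because all three quantities are differences of the same expected regrets, `none_vs_all` should equal `none_vs_model - all_vs_model` at every threshold, which gives a simple sanity check on any implementation.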
Based on the Regret DCA methodology, the optimal decision at each threshold probability is derived by comparing each pair of strategies through their corresponding NERDs according to the transitivity principle (i.e., if A > B, B > C then A > C). Thus, if NERD(strategy 1, strategy 2) > NERD(strategy 2, strategy 3) > 0 then strategy 2 is better than strategy 1, and strategy 3 is better than strategy 2. Therefore, strategy 3 is the optimal strategy.
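With three strategies, the pairwise comparison can be reduced to a single lookup. The sketch below assumes that NERD(strategy 1, strategy 2) is the expected regret of strategy 1 minus that of strategy 2, so that the NERD of each strategy against "treat none" measures the regret it saves; the function name and strategy labels are hypothetical.

```python
def optimal_strategy(nerd_none_vs_model: float, nerd_none_vs_all: float) -> str:
    """Pick the strategy with the least expected regret at one threshold.

    Each value is the regret saved relative to "treat none", which saves
    0 by definition. Ties are resolved in favor of the strategy listed
    first (treat none, then treat all, then model).
    """
    saved = {
        "treat none": 0.0,
        "treat all": nerd_none_vs_all,
        "model": nerd_none_vs_model,
    }
    return max(saved, key=saved.get)


if __name__ == "__main__":
    # Illustrative values only.
    print(optimal_strategy(nerd_none_vs_model=0.15, nerd_none_vs_all=0.05))
```

Ranking by regret saved is equivalent to chaining the pairwise comparisons, because NERD(Treat all, Model) is the difference of the two values passed in.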
Acceptable Regret
No decision model can guarantee that the recommended strategy will be the correct one. Therefore, we can always make a mistake and recommend treatment we should not have, or fail to recommend treatment we should have administered [42]. However, there are situations where the regret resulting from a wrong decision will be tolerable. These situations are best described under the notion of acceptable regret [26, 28, 31]. Formally, acceptable regret, Rg_0, is defined as the portion of utility a decision maker is willing to lose/sacrifice when he/she adheres to a decision that may prove wrong [26, 28, 31, 32]. For example, a physician may regret administering unnecessary treatment to a patient but he/she can "still live with" the consequences of this decision if he/she judges them to be trivial or inconsequential.
We assume that there is a linear relationship between the value of acceptable regret and the benefits of receiving treatment as well as the harms of receiving unnecessary treatment. This is a reasonable assumption because acceptable regret is expected to operate within a narrow range, at the lower or the upper end, of the probability scale. We define acceptable regret in terms of benefits of treatment, Rg_b, as the percentage (r_b) of benefits (U_1 - U_3) the decision maker is willing to forgo if his/her decision NOT to treat was wrong [43]:
$$Rg_b = r_b \cdot (U_1 - U_3) \qquad (15)$$
Alternatively, we define acceptable regret in terms of harms of unnecessary treatment, Rg_h, as the percentage (r_h) of harms (U_4 - U_2) the decision maker is willing to incur if his/her decision to treat was wrong [43]:
$$Rg_h = r_h \cdot (U_4 - U_2) \qquad (16)$$
We use the concept of acceptable regret to further refine the conditions under which the decision maker is indifferent between two strategies. Recall that these conditions have been initially captured in terms of threshold probability, which does not incorporate the sense of tolerable losses. Thus, we proceed with the following definition: two strategies are considered equivalent in regret (i.e. they will bring the same regret to the decision maker if they are proven wrong, in retrospect) if the absolute value of their net expected regret difference (NERD) is less than or equal to a predetermined amount of acceptable regret Rg_0. In other words, there is no difference between choosing the strategy "treat all" or "treat none" in terms of regret if:
$$\left|\text{NERD}[\text{Treat none}, \text{Treat all}]\right| \le Rg_0 \qquad (17)$$
Similarly, the strategies "model" and "treat none" are equivalent in regret if:
$$\left|\text{NERD}[\text{Treat none}, \text{Model}]\right| \le Rg_0 \qquad (18)$$
and the strategies "model" and "treat all" are equivalent in regret if:
$$\left|\text{NERD}[\text{Treat all}, \text{Model}]\right| \le Rg_0 \qquad (19)$$
The acceptable regret, Rg_0, can be computed using either of the two definitions described in equations 15 and 16.
We can also use equations 15 and 16 to identify the prognostic probabilities at which the decision maker would not regret the decision to which he/she is committed even if that decision may prove wrong. For instance, we are typically interested in the prognostic probability above which a physician would commit to the decision to treat a patient, and the probability below which he/she would not treat a patient, without feeling undue consequences of these decisions [28]. In other words, we are looking for the probabilities for which ERg(Treat all) ≤ Rg_h and ERg(Treat none) ≤ Rg_b. Solving the inequalities using equations 4, 5, 15, and 16, and after scaling Rg_0 by (U_1 - U_3), we obtain
$$P_{\text{treat all}} \ge 1 - r_h \qquad (20)$$
where P_treat all is the prognostic probability above which the physician would tolerate giving treatment that may prove unnecessary. Similarly,
$$P_{\text{treat none}} \le r_b \qquad (21)$$
represents the prognostic probability below which the physician would comfortably withhold treatment that may prove beneficial, in retrospect.
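The two bounds can be illustrated with a short Python sketch. It rests on an assumption about how the inequalities resolve: with ERg(Treat all) = (1 - p)(U_4 - U_2) and Rg_h = r_h(U_4 - U_2), the condition ERg(Treat all) ≤ Rg_h reduces to p ≥ 1 - r_h, and likewise ERg(Treat none) = p(U_1 - U_3) ≤ r_b(U_1 - U_3) reduces to p ≤ r_b. The function names are hypothetical.

```python
from typing import Optional, Tuple


def acceptable_regret_bounds(r_h: float, r_b: float) -> Tuple[float, float]:
    """Return (P_treat_all, P_treat_none) for tolerances r_h and r_b.

    Assumption: treating is tolerable when p >= 1 - r_h, and withholding
    treatment is tolerable when p <= r_b.
    """
    for name, r in (("r_h", r_h), ("r_b", r_b)):
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"{name} must be a fraction in [0, 1]")
    return 1.0 - r_h, r_b


def tolerable_action(p: float, r_h: float, r_b: float) -> Optional[str]:
    """Action the physician could commit to without undue regret, if any.

    If both conditions hold (r_h + r_b >= 1), "treat" is returned first.
    """
    p_treat_all, p_treat_none = acceptable_regret_bounds(r_h, r_b)
    if p >= p_treat_all:
        return "treat"
    if p <= p_treat_none:
        return "do not treat"
    return None  # neither decision falls within acceptable regret


if __name__ == "__main__":
    # Illustrative tolerances only: 5% of harms, 2% of benefits.
    print(acceptable_regret_bounds(0.05, 0.02))
    print(tolerable_action(0.97, 0.05, 0.02))
```

Between the two bounds neither action is covered by acceptable regret, which is where the threshold probability and the regret DCA curves would guide the decision.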
Note that equations 20 and 21 express acceptable regret in terms of probabilities while equations 17-19 define it in terms of NERD. Hence, the outputs of these equations are not the same; rather, they complement each other.
Elicitation of acceptable regret
In most cases the decision maker does not have a complete understanding of benefits lost or harms inflicted and cannot assign a precise number to them. For this reason, we do not suggest inquiring directly about the value of r. Instead, we propose eliciting r through the decision-maker's responses to specific clinical scenarios. For example, we propose the following approach:
Assume that you have 100 patients with the same probability of disease as the patient you are currently treating. You need to decide whether each of these patients should receive treatment or not. Since no prediction model is 100% accurate, it is expected that you will make some mistakes in your treatment recommendations (e.g. you may recommend treatment to a patient who does not need it, or fail to recommend treatment to a patient who needs it).
1. We are now interested in knowing your tolerance toward administering unnecessary treatment, i.e. we want to learn the magnitude of the unavoidable error you can live with when inflicting potentially harmful treatment on a patient. Note that if you say that your acceptable regret is zero, this means that you can only make a decision if you are absolutely certain that your recommendation is correct.
Out of the number (100-y) of patients who should not have received treatment, how many patients would you tolerate treating? (The answer is used to compute r_h.)
2. We are interested in knowing your tolerance toward failing to provide necessary treatment, i.e. we want to learn the magnitude of the unavoidable error you can live with when forgoing potentially beneficial treatment. Note that if you say that your acceptable regret is zero, this means that you can only make a decision if you are absolutely certain that your recommendation is correct.
Out of the number (100-x) of patients who should have been treated, how many patients would you tolerate not treating? (The answer is used to compute r_b.)
It is unnecessary to ask the decision maker to answer both questions. We suggest asking only the question related to the recommendation the physician is about to make: if the recommendation is to administer treatment, the decision maker should be asked the first question (tolerance toward unnecessary treatment), while if it is to withhold treatment, he/she should be asked the second question (tolerance toward failing to provide necessary treatment).
The value of acceptable regret is plotted in the regret DCA graph to visually facilitate the decision making process. At a specific threshold probability, all strategies for which |NERD| ≤ Rg_0 are considered equivalent in regret, according to the definition in the previous section.
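The equivalence check itself is a one-line comparison, sketched below in Python. It assumes that Rg_0 has been expressed on the same scale as the NERD values (that is, divided by U_1 - U_3 if the NERDs were computed from rescaled utilities); the function name is hypothetical.

```python
def equivalent_in_regret(nerd: float, rg0: float) -> bool:
    """True if two strategies differ by no more than the acceptable regret.

    Assumption: nerd and rg0 are expressed in the same units.
    """
    if rg0 < 0:
        raise ValueError("acceptable regret must be non-negative")
    return abs(nerd) <= rg0


if __name__ == "__main__":
    # Illustrative values only.
    print(equivalent_in_regret(-0.01, 0.02))
    print(equivalent_in_regret(0.05, 0.02))
```

Applied across the range of threshold probabilities, this check marks the regions of the regret DCA graph where the choice between two strategies does not matter to the decision maker.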