### The microsimulation model for colorectal cancer

The ColoRectal Cancer Simulated Population Incidence and Natural history (CRC-SPIN) model [41, 42] describes the natural history of CRC based on the adenoma-carcinoma sequence [20, 26]. Four model components describe the natural history of CRC: (1) adenoma risk; (2) adenoma growth; (3) transition from adenoma to preclinical cancer; and (4) transition from preclinical to clinical cancer (sojourn time). CRC-SPIN has been used to provide guidance to the Centers for Medicare and Medicaid Services (CMS) [51] and to inform U.S. Preventive Services Task Force CRC screening guidelines [17]. We provide an overview of CRC-SPIN 2.0 [43], which contains 22 calibrated parameters, \(\theta\), with truncated normal or uniform prior distributions informed by published literature and previous calibration exercises. We refer the reader to [43] and to cisnet.cancer.gov [28] for more detail.

#### Adenoma risk

Adenomas are assumed to arise according to a non-homogeneous Poisson process with a piecewise linear age-effect. The *i*th agent’s baseline instantaneous risk of an adenoma at age \(a=20\) years is given by \(\psi _i(20)= \exp (\alpha _{0i} + \alpha _1 \text{ female}_i)\), where \(\alpha _{0i} \sim N(A,\sigma _\alpha )\) and \(\alpha _1\) captures the difference in risk for women (female\(_i=1\) indicates agent *i* is female). Adenoma risk changes over time, generally increasing with age, a process we model using a piecewise linear function for log-risk with knots at ages 50, 60, and 70 and assuming zero risk before age 20 [18]:

$$\begin{aligned} \ln (\psi _i(a))= & {} \alpha _{0i} + \alpha _1 \text{ female}_i + \delta (a\ge 20)\min (a-20,30)\alpha _{20} \nonumber \\&+\delta (a\ge 50)\min ((a-50),10)\alpha _{50} \nonumber \\&+ \delta (a\ge 60)\min ((a-60),10)\alpha _{60} \nonumber \\&+ \delta (a\ge 70)(a-70)\alpha _{70} . \end{aligned}$$

(1)

#### Adenoma growth

For each adenoma, we simulate a hypothetical time to reach 10 mm, assuming that \(t_{10mm}\) has a Fréchet distribution with shape parameter \(\beta _1\), scale parameter \(\beta _2\), and cumulative distribution function given by

$$\begin{aligned} F(t) = \exp \left[ - \left( \frac{t}{\beta _2} \right) ^{-\beta _1} \right] \end{aligned}$$

(2)

for \(t \ge 0\). We allow different scale and shape parameters for adenomas in the colon and rectum, using the notation \(\beta _{1c}\) and \(\beta _{2c}\) for the colon, and \(\beta _{1r}\) and \(\beta _{2r}\) for the rectum.
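Because Eq. (2) has a closed-form inverse, \(t = \beta _2 (-\ln u)^{-1/\beta _1}\), times to 10 mm can be drawn by inverse-transform sampling. The sketch below assumes this sampling approach for illustration only; the source does not state how CRC-SPIN draws these times, and all function names are hypothetical.

```python
import math
import random


def frechet_cdf(t, shape, scale):
    """CDF from Eq. (2); defined as 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((t / scale) ** (-shape)))


def frechet_quantile(u, shape, scale):
    """Inverse of Eq. (2) for u in (0, 1)."""
    return scale * (-math.log(u)) ** (-1.0 / shape)


def sample_time_to_10mm(shape, scale, rng):
    """Draw one time to 10 mm by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, where log is undefined
        u = rng.random()
    return frechet_quantile(u, shape, scale)
```

Separate colon and rectum draws would simply call `sample_time_to_10mm` with \((\beta _{1c}, \beta _{2c})\) or \((\beta _{1r}, \beta _{2r})\).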

Adenoma size at any point in time is simulated using the Richards growth model, with a calibrated parameter that allows for a wide range of sigmoidal growth patterns [48]. The diameter of the *j*th adenoma in the *i*th agent at time *t* after initiation is given by

$$\begin{aligned} d_{ij}(t) = d_\infty \left[ 1 + \left( \left( \frac{d_0}{d_\infty }\right) ^{1/p} -1 \right) \exp (-\lambda _{ij} t) \right] ^p \end{aligned}$$

(3)

where \(d_0=1\) mm is the minimum adenoma diameter and \(d_\infty =50\) mm is the maximum adenoma diameter. The calibrated parameter *p* determines the shape of the growth curve. The growth rate for the *j*th adenoma within the *i*th agent, \(\lambda _{ij}\), is calculated by setting \(t=t_{\mathrm{10\,mm}}\) and \(d_{ij}=10\) in Eq. (3) and solving for \(\lambda _{ij}\).
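Solving Eq. (3) for the growth rate gives \(\lambda _{ij} = -\frac{1}{t_{10mm}} \ln \left[ \frac{(10/d_\infty )^{1/p}-1}{(d_0/d_\infty )^{1/p}-1}\right]\). The following Python sketch illustrates this calculation; the function names are hypothetical and the code assumes \(p>0\) and \(t_{10mm}>0\).

```python
import math

D0 = 1.0      # minimum adenoma diameter (mm)
D_INF = 50.0  # maximum adenoma diameter (mm)


def growth_rate(t_10mm, p, d0=D0, d_inf=D_INF):
    """Solve Eq. (3) for lambda so that the diameter is 10 mm at t_10mm."""
    num = (10.0 / d_inf) ** (1.0 / p) - 1.0
    den = (d0 / d_inf) ** (1.0 / p) - 1.0
    return -math.log(num / den) / t_10mm


def diameter(t, lam, p, d0=D0, d_inf=D_INF):
    """Adenoma diameter (mm) at time t after initiation, Eq. (3)."""
    c = (d0 / d_inf) ** (1.0 / p) - 1.0
    return d_inf * (1.0 + c * math.exp(-lam * t)) ** p
```

With this choice of \(\lambda _{ij}\), the curve starts at \(d_0\) at \(t=0\), passes through 10 mm at \(t_{10mm}\), and approaches \(d_\infty\) as \(t\) grows.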

#### Transition from adenoma to preclinical invasive cancer

For the *j*th adenoma in the *i*th agent, the size at transition to preclinical cancer (in mm) is simulated using a log-normal distribution; the underlying (exponentiated) normal distribution is assumed to have standard deviation \(\sigma _{\gamma }\) and mean

$$\begin{aligned} \mu _{ij} = \gamma _0 + \gamma _1 \text{ female}_{i} + \gamma _2 \text{ rectum}_{ij} + \gamma _3 \text{ female}_{i} \text{ rectum}_{ij} + \gamma _4 \text{ age}_{ij} + \gamma _5 \text{ age}_{ij}^2 \end{aligned}$$

(4)

where rectum\(_{ij}\) is an indicator of rectal versus colon location and age\(_{ij}\) is the age at adenoma initiation in decades, centered at 50 years. Based on this model, the probability that an adenoma transitions to preclinical cancer increases with size. Most adenomas do not reach transition size and small adenomas are unlikely to transition to cancer.
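A draw of the transition size under Eq. (4) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the CRC-SPIN code: the names are hypothetical, and "age in decades, centered at 50 years" is interpreted here as \((\text{age} - 50)/10\).

```python
import random


def transition_size_mean(gamma, female, rectum, age_at_initiation):
    """Mean mu_ij of the underlying normal distribution, Eq. (4).

    gamma is a sequence (gamma_0, ..., gamma_5); age_at_initiation is in
    years and is converted to decades centered at 50 (an assumption).
    """
    g0, g1, g2, g3, g4, g5 = gamma
    age = (age_at_initiation - 50.0) / 10.0
    return (g0 + g1 * female + g2 * rectum + g3 * female * rectum
            + g4 * age + g5 * age ** 2)


def sample_transition_size(mu, sigma_gamma, rng):
    """Draw a size at transition (mm) from the log-normal distribution."""
    return rng.lognormvariate(mu, sigma_gamma)
```

An adenoma whose simulated diameter never reaches its drawn transition size does not become a preclinical cancer, which is why most adenomas do not transition.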

#### Sojourn time

Sojourn time is the time from the onset of preclinical (asymptomatic) CRC to clinical (symptomatic and detected) cancer. We simulate sojourn time using a Weibull distribution with survival function

$$\begin{aligned} S(t) = \exp \left( -\left( \frac{t}{\lambda _1}\right) ^{\lambda _{2}} \right) \end{aligned}$$

(5)

for preclinical cancer in the colon, and assume a proportional hazards model, with hazard ratio \(\exp (\lambda _3\text{ rectum}_{ij})\), to allow sojourn time to systematically differ for preclinical cancers in the colon and rectum.
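Under proportional hazards, the survival function for a preclinical cancer becomes \(S(t)^{\exp (\lambda _3 \text{ rectum}_{ij})}\), which can again be inverted in closed form. The Python sketch below illustrates this; it assumes inverse-transform sampling, which the source does not specify, and the function names are hypothetical.

```python
import math


def sojourn_survival(t, lam1, lam2, lam3, rectum):
    """Survival function: Eq. (5) raised to the hazard ratio exp(lam3 * rectum)."""
    hazard_ratio = math.exp(lam3 * rectum)
    return math.exp(-hazard_ratio * (t / lam1) ** lam2)


def sojourn_quantile(u, lam1, lam2, lam3, rectum):
    """Time t at which the survival function equals u, for u in (0, 1)."""
    hazard_ratio = math.exp(lam3 * rectum)
    return lam1 * (-math.log(u) / hazard_ratio) ** (1.0 / lam2)
```

Passing a uniform draw `u` to `sojourn_quantile` yields one simulated sojourn time; with `rectum=0` the function reduces to the colon distribution in Eq. (5).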

#### Simulation of lifespan and colorectal cancer survival

Once a cancer becomes clinically detectable, we simulate stage and tumor size at clinical detection based on Surveillance, Epidemiology, and End Results (SEER) data from 1975 to 1979, the most recent period prior to widespread dissemination of CRC screening tests [27]. Survival time after CRC diagnosis is based on the first diagnosed CRC and depends on age, sex, cancer location, and stage, and is simulated using relative survival estimates from analysis of SEER data from individuals diagnosed with CRC from 1975 through 2003 [39]. We assume proportional hazards of CRC and other-cause mortality within sex and birth-year cohorts. Other-cause mortality is modeled using survival probabilities based on product-limit estimates for age and birth-year cohorts from the National Center for Health Statistics Databases [29].

### Calibration data

Calibration data consist of individual-level data that are reported in aggregate in published studies. Calibration targets therefore take the form of summary statistics. Because targets come from both small and large studies, as well as from registry data that yield very precisely estimated targets, the level of uncertainty varies across targets. Generating calibration targets requires simulating a set of agents with risk that is similar to the study population based on age, gender, prior screening patterns, and the time period of the study, which may affect both overall and cancer-specific mortality. This simulation can be computationally demanding, depending on the number of agents and the process used to simulate the particular target. We calibrated to 40 targets from six sources: SEER registry data ([27], 20 targets) and five published studies (20 targets). Let \(y=(y_1,\dots ,y_{J})\) denote these \(J=40\) calibration targets.

SEER colon and rectal cancer incidence rates in 1975–1979 are reported per 100,000 individuals and are a key calibration target. We assumed that the number of incident CRC cases in any year follows a binomial distribution with number of trials equal to the SEER population size. To simulate SEER incidence rates, we generated a population of individuals aged 20 to 100 who are free from clinically detected CRC, with an age and sex distribution that matches the SEER 1978 population. Model-predicted CRC incidence is based on the number of people who develop CRC in the next year.

To simulate additional targets from published studies, we generated separate populations for each study that match the age and gender distribution of the sample during the time-period of the study. We assume that study participants are free from symptomatic (clinically detectable) CRC and have not been screened for CRC prior to the study. This is a reasonable assumption because studies used for model calibration were conducted prior to widespread screening, or were based on minimally screened samples.

Simulation of targets also requires simulating the detection and removal of adenomas and preclinical cancers. The probability of detection, or test sensitivity, is a function of lesion size, and is informed by back-to-back colonoscopy studies [10, 38]. We specify the probability of not detecting (missing) an adenoma of size *s* so that the resulting miss rates are consistent with observed findings [10, 38]; the same specification was used successfully in [43].

The miss rate functions result in sensitivities of 0.81 for adenomas of 5 mm, 0.92 for adenomas of 10 mm, and 0.98 for adenomas of 15 mm. For preclinical cancers, we assume that sensitivity is the maximum of 0.95 and the size-based adenoma sensitivity, so that colonoscopy sensitivity is 0.95 for preclinical cancers 12 mm or smaller and greater than 0.95 for larger preclinical cancers.

### Calibration results and motivation for recalibration

A Bayesian approach to model calibration is preferred in this context because it enables uncertainty quantification in model parameters and can incorporate different sources of information through calibration targets and prior distributions. We use the Incremental Mixture Approximate Bayesian Computation (IMABC) algorithm developed by [43]. We refer the reader to Additional file 1 for more description of the algorithm. This algorithm generates a sample of MSM parameter draws from an approximate posterior distribution. The approach is an approximate Bayesian version of adaptive importance sampling [36, 47] and is similar to the Population Monte Carlo ABC algorithm of Beaumont et al. [1]. While McKinley et al. [24] showed that popular ABC methods can be inefficient or fail to converge when applied to complex, high-dimensional models (in their example there are 22 parameters and 18 outputs, which we refer to here as targets), IMABC successfully calibrated the CRC-SPIN model with 22 parameters and 40 targets.
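To make the basic idea of approximate Bayesian computation concrete, the following Python sketch shows plain rejection ABC with tolerance intervals. It is deliberately not IMABC: it omits the incremental mixture and importance-sampling steps described in Additional file 1, and the function names and the toy one-parameter model in the usage note are hypothetical.

```python
import random


def within_tolerance(simulated, intervals):
    """True if every simulated target lies inside its (low, high) interval."""
    return all(low <= s <= high for s, (low, high) in zip(simulated, intervals))


def rejection_abc(sample_prior, simulate_targets, intervals, n_draws, rng):
    """Plain rejection ABC (not IMABC).

    Draw parameters from the prior, simulate the targets, and keep only
    the draws whose simulated targets all fall within tolerance.
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        simulated = simulate_targets(theta, rng)
        if within_tolerance(simulated, intervals):
            accepted.append(theta)
    return accepted
```

For example, with a uniform prior on a single parameter and a toy simulator that returns the parameter itself, `rejection_abc(lambda r: r.uniform(0, 10), lambda th, r: [th], [(4.0, 6.0)], 1000, random.Random(0))` keeps only draws between 4 and 6. With 22 parameters and 40 targets, very few prior draws would satisfy all tolerance intervals at once, which is one reason an adaptive scheme is needed.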

The application of IMABC in the experiments described here was implemented using the EMEWS framework [31] and run on the Midway2 cluster at the University of Chicago Research Computing Center and the Bebop cluster at Argonne National Laboratory. To take advantage of parallel processing, we used 80 node job allocations to execute up to 318 concurrent CRC-SPIN instances.

Internal model validation indicated that, as expected, the model-simulated (predicted) targets were within the tolerance intervals of the observed calibration targets. However, the posterior distributions for two parameters of the sojourn time (time spent in the preclinical cancer phase) distribution largely reflected the prior distributions [43], and suggested too little time spent in the preclinical cancer phase. This finding was also noted by Rutter et al. [40] for an earlier version of the CRC-SPIN model, based on the results of a model validation and a comparison to other CRC models [16, 49].

We hypothesize that sojourn time is not well estimated because it is informed by screening studies, and our targets include only a single target from a screening study; that target is imprecise and therefore has extremely wide tolerance intervals [11]. Sojourn time and preclinical cancer detection rates are closely related, as a longer sojourn time implies more time to detect preclinical cancers. In light of this finding, we sought to include an additional calibration target that would provide more information about sojourn time.