The Continual Reassessment Method (CRM) is the elder statesman of adaptive clinical trials. Originally published by O’Quigley, Pepe, and Fisher (1990), it assumes a smooth mathematical form for the dosetoxicity curve to conduct a dosefinding trial seeking a maximum tolerable dose. It is a truly seminal design, with many variants appearing over the years to handle different clinical scenarios, such as seperate groups, lateonset toxicity, efficacy and toxicity outcomes, and more.
In this vignette, we focus on the original Bayesian MTDseeking design in one homogeneous group. Let \(x_i\) be a standardised doselevel. The probability of doselimiting toxicity (DLT) at dose \(x_i\) is estimated to be \(F(x_i, \theta)\), where \(F\) is a smooth mathematical function, and \(\theta\) is a general vector of parameters. Even within this simple scenario, there are variants that use different forms for \(F\), and different distributions on \(\theta\).
The CRM variants currently implemented in trialr
for MTDfinding in a single homogeneous group are:
Slot in trialr::stanmodels

Link Function  Parameters 

CrmEmpiricNormalPrior 
\(F(x_i, \beta)\) = \(x_i^{\exp{\beta}}\)  normal prior on \(\beta\); 
CrmOneParamLogisticNormalPrior 
\(F(x_i, \beta) = 1 / (1 + \exp{(a_0  \exp{(\beta)} x_i}))\)  \(a_0\) fixed and a normal prior on \(\beta\) 
CrmOneParamLogisticGammaPrior 
\(F(x_i, \beta) = 1 / (1 + \exp{(a_0  \beta x_i)})\)  \(a_0\) fixed and a gamma prior on \(\beta\) 
CrmTwoParamLogisticNormalPrior 
\(F(x_i, \alpha, \beta) = 1 / (1 + \exp{(\alpha  \exp{(\beta)} x_i)})\)  normal priors on \(\alpha\) and \(\beta\) 
These models are compiled when you install trialr
and stored in the stanmodels
object. For instance, to fit the empiric model you would use stanmodels$CrmEmpiricNormalPrior
. Examples are given below.
Where necessary, the slope parameter \(\beta\) is exponentiated to ensure a positive value. One of the core assumptions of the CRM and doseescalation methods in general, is that the probability of toxicity increases with dose. If a negative value for \(\beta\) is implausible, it makes sense to reflect this in the model. In the model with gamma prior, this step is not necessary because the prior constrains the values to be positive.
Other CRM variants could easily be added in future, if requested. For instance, Cheung (2011) defines a model that uses the hyperbolic tangent link function, but we have not implemented it here because we have never required it.
Let’s conduct some examples.
To access the Stan implementations of the CRM, we must load trialr
:
For illustration, we consider a scenario where we investigate 5 doselevels seeking the dose with Pr(DLT) closest to 25%. We will assume that our skeleton (i.e. our prior belief of the dosetoxicity curve) is
Say that we have treated 6 patients at 3 doselevels:
Patient  Cohort  Doselevel  DLT 

1  1  2  0 
2  1  2  0 
3  2  3  0 
4  2  3  0 
5  3  4  1 
6  3  4  1 
These data can be conveyed using our scheme for representing phase I outcomes. We use integers to represent doselevels given and the letters T and N to represent toxicity or notoxicity for individual patients. These clusters can be strewn together to represent the outcomes of cohorts of patients. For instance, the outcomes above can be represented:
We will update each of the CRM models implemented in trialr
for this scenario. We then briefly compare the inferences at the end.
The empiric model requires only that a prior normal standard deviation be specified for the slope parameter. We use a prior variance of \(\beta\) equal to 1.34, as used by Cheung in his examples.
mod1 < stan_crm(outcomes, skeleton = skeleton, target = target,
model = 'empiric', beta_sd = sqrt(1.34), seed = 123)
mod1
In this demonstration, we set the seed for reproducibility when sampling however, in the wild, you might not do this.
We can extract the samples of the posterior probability of DLT at each dose, stored as the series of parameters called prob_tox
. From there, all manner of inference is possible. For example, we can recreate the posterior expected probability of DLT at each dose shown above. It is simply the mean of each sample:
prob_tox_samp1 < as.data.frame(mod1, 'prob_tox')
prob_tox1 < colMeans(prob_tox_samp1)
prob_tox1
#> prob_tox[1] prob_tox[2] prob_tox[3] prob_tox[4] prob_tox[5]
#> 0.1081169 0.2159618 0.3098591 0.4444842 0.6235105
This confirms that doselevel 2 is closest to our toxicity target of 25%. Having posterior samples is very liberating from an inferential point of view. For example, it is trivial to calculate the probability that the toxicity rate exceeds the target DLT probability at each dose:
prob_too_toxic1 < unname(colMeans(prob_tox_samp1 > target))
prob_too_toxic1
#> [1] 0.11700 0.35725 0.60075 0.86500 0.99150
Even at this early stage, we are quite sure that the top dose is too toxic. This type of calculation is trivial because we have access to the posterior samples.
This model requires that a fixed intercept value and parameters for the normal prior on the slope term are specified:
mod2 < stan_crm(outcomes, skeleton = skeleton, target = target,
model = 'logistic', a0 = 3, beta_mean = 0, beta_sd = sqrt(1.34),
seed = 123)
mod2
Once again, we see that doselevel 2 would be recommended for the next cohort because it is forecast to have rob(DLT) closest to the target, 25%.
This model requires that a fixed intercept value and parameters for the gamma prior on the slope term are specified:
mod3 < stan_crm(outcomes, skeleton = skeleton, target = target,
model = 'logistic_gamma',
a0 = 3, beta_shape = 1, beta_inverse_scale = 1,
seed = 123)
mod3
This use of a Gamma(1, 1) prior is the same as an Exponential(1) prior.
This model requires that parameters for normal priors on the intercept and slope term are specified:
mod4 < stan_crm(outcomes, skeleton = skeleton, target = target,
model = 'logistic2',
alpha_mean = 0, alpha_sd = 1, beta_mean = 0, beta_sd = 1,
seed = 123)
mod4
These estimates look different to the other methods, not least because we have estimated an extra parameter. We compare the inferences in the next section.
Let’s compare the posterior estimates of the dosetoxicity curve:
post_curves < data.frame(
DoseLevel = 1:length(skeleton),
Empiric = mod1$prob_tox,
Logit1N = mod2$prob_tox,
Logit1G = mod3$prob_tox,
Logit2N = mod4$prob_tox
)
knitr::kable(post_curves, digits = 2)
DoseLevel  Empiric  Logit1N  Logit1G  Logit2N 

1  0.11  0.11  0.12  0.06 
2  0.22  0.23  0.23  0.14 
3  0.31  0.32  0.32  0.23 
4  0.44  0.45  0.45  0.43 
5  0.62  0.62  0.62  0.70 
It is perhaps no surprise that the three singleparameter models (Empiric
, Logit1N
, & Logit1G
) provide very similar estimates and agree that the dose closest to the target is doselevel 2. In contrast, the twoparameter model estimates a more convex curve, and proposes doselevel 3 for the next patient(s). Let’s have a look at that graphically:
post_curves_tall < data.frame(
DoseLevel = rep(1:length(skeleton), times = 4),
ProbTox = c(mod1$prob_tox, mod2$prob_tox, mod3$prob_tox, mod4$prob_tox),
Model = rep(c('Empiric', 'Logit1N', 'Logit1G', 'Logit2N'), each = 5)
)
library(ggplot2)
ggplot(post_curves_tall, aes(x = DoseLevel, y = ProbTox, group = Model, col = Model)) +
geom_line(size = 1.2) +
geom_hline(yintercept = target, col = 'red', linetype = 'dashed')
If we suspect that the difference in inference might be driven by our prior for \(\alpha\) in the twoparameter model, we can look at a summary of the posterior samples for that parameter:
mean  se_mean  sd  2.5%  25%  50%  75%  97.5%  n_eff  Rhat  

alpha  0.4  0.02  0.83  1.22  0.16  0.39  0.95  2.09  1601.19  1 
The posterior mean is not near the tail of the \(N(0, 1)\) prior distribution and the posterior 95% credible interval has relocated from prior guess of (2, 2). The prior on \(\alpha\) does not appear to be unduly driving the divergence from the oneparameter models.
The traditional method of inference using the CRM has been to estimate the expected dosetoxicity curve and infer the suggested dose by identifying which is estimated to have toxicity rate closest to the target.
However, with Stan we can take a different approach. We have many possible realities sampled from the posterior distributions. For instance, each Stanfit samples many possible dosetoxicity curves, in accordance with the model and priors specified and the data observed. With each sampled dosetoxicity curve we can calculate the dose that is closest to the target. Each sampled curve advocates a single dose. We can then tabulate the number of times that each dose is chosen, effectively creating the posterior probability that each dose is the MTD. This is probably what we really want when we conduct an MTDseeking clinical trial.
knitr::kable(
data.frame(
DoseLevel = 1:length(skeleton),
ProbMTD_Empiric = as.numeric(mod1$prob_mtd),
ProbMTD_Logit2N = as.numeric(mod4$prob_mtd)
), digits = 2
)
DoseLevel  ProbMTD_Empiric  ProbMTD_Logit2N 

1  0.21  0.11 
2  0.27  0.17 
3  0.27  0.41 
4  0.21  0.29 
5  0.04  0.03 
The empiric model (model 1) is actually relatively ambivalent between the first four doselevels, assigning each roughly the same probability of being the MTD. The other oneparameter models are similar (not shown). In contrast, the twoparameter model is actually more confident in its choice of doselevel 3. However, none of the models is particularly confident and this should not surprise us as we have only observed 6 patients.
There are many vignettes illuminating the CRM in trialr
:
Cheung, Ying Kuen. 2011. Dose Finding by the Continual Reassessment Method. New York: Chapman & Hall / CRC Press.
O’Quigley, J, M Pepe, and L Fisher. 1990. “Continual reassessment method: a practical design for phase 1 clinical trials in cancer.” Biometrics 46 (1): 33–48. https://doi.org/10.2307/2531628.