The `escalation` package by Kristian Brock. Documentation is hosted at https://brockk.github.io/escalation/

Dose-finding clinical trials investigate experimental agents, seeking
doses that are tolerable and active. They commonly use some
dose-selection model to analyse binary outcomes at increasing doses in
cohorts of patients. It is possible to analyse the behaviour of a
dose-finding design by exhaustively calculating every possible set of
outcomes and invoking the dose selection model on each. We will refer to
these hypothetical sequences of doses in response to outcomes as
*dose-paths*.

An example will make this clear. The 3+3 design is widely understood.
One of its benefits is its simplicity. Let us investigate the first two
cohorts of three patients in the 3+3 variant that permits de-escalation.
We calculate dose-paths in `escalation` using the `get_dose_paths` function:

```
library(escalation)
#> Loading required package: magrittr
```

```
paths <- get_three_plus_three(num_doses = 5, allow_deescalate = TRUE) %>%
  get_dose_paths(cohort_sizes = c(3, 3))
```

`get_dose_paths` takes a dose-selection methodology and enumerates every possible path according to the cohort sizes that you specify. The returned `paths` object contains a lot of information, but perhaps most pertinent is the sequence of dose-recommendations in response to hypothetical outcomes:

```
paths
#> Give dose 1:
#>   NNN -> 2
#>     NNN -> 3
#>     NNT -> 2
#>     NTT -> 1
#>     TTT -> 1
#>   NNT -> 1
#>     NNN -> 2
#>     NNT -> NA
#>     NTT -> NA
#>     TTT -> NA
#>   NTT -> NA
#>   TTT -> NA
```

We see above that the trial starts at dose 1. If three non-toxicities
(N) are seen, the algorithm advocates escalation to dose 2. In contrast,
if two or more toxicities (T) are seen in the first cohort, no dose is
selected (the dose is `NA`) and the trial stops. Subsequent
cohorts are represented by greater levels of indentation.

This information is better represented by a graph. If you are using
RStudio, the graph will appear in the *Viewer* pane. (In this
vignette, we suppress non-RStudio demonstration of the graphs to avoid
problems on CRAN computers.)

```
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths)
}
```

The blue node towards the centre bearing the number `1`
represents the start of the trial. The edges of the graph (the lines
connecting the nodes) represent the outcomes in cohorts of patients. As
before, we see that if three `N` events are seen, the trial
escalates to dose 2. Three patients are assumed to be treated at that
dose and from there, further escalation to dose 3 will be advised if no
toxicity is seen. In contrast, dose 2 will be retained for a further
cohort of three if a single toxicity is seen, and de-escalation to dose
1 will occur if two or more toxicities are seen. This is exactly the
behaviour we expect from the 3+3 design.

Dose-paths were introduced in the phase I setting that we consider here by Yap et al. (2017). Their phase I/II analogue was introduced by Brock et al. (2017) for dose-finding trials that consider efficacy and toxicity outcomes.

The above example uses the 3+3 design but other methods are
available. In fact, early phase statisticians would much prefer that you
use a model-based method (Wheeler et al. 2019). The `escalation`
package is intentionally written so that all dose selectors look the
same, regardless of the implementation-level details. In computing
terminology, dose selectors in `escalation` support a common interface.
This makes it trivial to calculate dose-paths for all dose-selectors -
we just use the `get_dose_paths` function again.

Let us investigate a continual reassessment method (CRM) design now, as described by O’Quigley, Pepe, and Fisher (1990) and implemented by the dfcrm package (Cheung 2013). We must specify a dose-toxicity skeleton, and a target toxicity level:

```
skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25
```

We can then specify a CRM model and calculate dose-paths as before:

```
paths <- get_dfcrm(skeleton = skeleton, target = target) %>%
  get_dose_paths(cohort_sizes = c(3, 3))
```

When we graph the paths this time (and adopt a different colour palette for fun):

```
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, viridis_palette = 'magma')
}
```

we see that the `Stop` node is absent - all paths
recommend a dose, even those seeing substantial toxicity. Stopping
behaviours must be specified for the CRM method. Fortunately,
`escalation` makes this simple.

An intuitive method for stopping in the Bayesian setting is to test the posterior distribution for the probability of toxicity. In our example, we will stop if there is at least a 90% probability that the toxicity rate at the lowest dose is 35% or greater:

```
paths <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.9) %>%
  get_dose_paths(cohort_sizes = c(3, 3))
```
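The flavour of such a posterior test can be sketched with a simple conjugate Beta-Binomial model. This is illustrative only: the CRM itself fits a one-parameter dose-toxicity model, and `stop_when_too_toxic` tests the posterior of that fitted model, not this conjugate sketch.

```
# Illustrative sketch only: place a Beta(1, 1) prior on the toxicity rate
# at dose 1 and update after observing 2 toxicities in 3 patients.
# The posterior is then Beta(1 + 2, 1 + 1).
prob_too_toxic <- 1 - pbeta(0.35, shape1 = 1 + 2, shape2 = 1 + 1)
round(prob_too_toxic, 3)
#> [1] 0.874
# 0.874 is below the 0.9 threshold, so this sketch would not yet stop.
```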

When we visualise the paths from the updated model:

```
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, viridis_palette = 'inferno')
}
```

we see that some paths once again recommend stopping. The paths that recommend a dose are otherwise unchanged.

Another dose-escalation model implemented in `escalation` is the
Bayesian optimal interval (BOIN) method by Liu and Yuan (2015),
implemented in the `BOIN` package (Yuan and Liu 2018).

To spice things up, we will visualise how this model behaves over four cohorts of two patients:

```
paths <- get_boin(num_doses = 4, target = target) %>%
  get_dose_paths(cohort_sizes = rep(2, 4))
#> You have requested 121 model evaluations. Be patient.
```

```
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, RColorBrewer_palette = 'YlOrRd')
}
```

The first thing to note about the graph above is that it is much more complex than the previous graphs. The number of nodes in dose-paths increases faster than linearly as more cohorts are considered. This should be borne in mind when visualising dose-paths.

Secondly, the visual method makes it simple to discern messages about future model behaviour. For instance, we can easily see that dose 4, the darkest red node, is only reached in the first four cohorts if the first cohort sees no toxicity.

In contrast to CRM, BOIN does have a stopping rule for excess
toxicity built in. We see that `TT` in the first cohort is
not enough to advocate stopping but that any further toxicity in the
second cohort will be sufficient to warrant stopping.

There is no reason that the cohort sizes should be uniform. Specify
whatever `cohort_sizes` you like:

```
paths <- get_boin(num_doses = 4, target = target) %>%
  get_dose_paths(cohort_sizes = c(3, 1, 2))
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, RColorBrewer_palette = 'Blues')
}
```

You can even evaluate the dose advice after each patient using cohort sizes of 1:

```
paths <- get_boin(num_doses = 4, target = target) %>%
  get_dose_paths(cohort_sizes = rep(1, 4))
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, RColorBrewer_palette = 'RdPu')
}
```

It is possible to calculate dose-paths from trials that are partially
completed. For instance, let us continue with our BOIN model and assume
that we have seen outcomes `1NNN 2TNT` so far in our trial.
Thus, we are reasonably sure that dose 1 is safe but wary that dose 2
might be too toxic. However, these beliefs are tempered by the tiny
sample size. From this starting point, how might the next two cohorts of
three proceed? We just specify the previous outcomes using the
`previous_outcomes` parameter:

```
paths <- get_boin(num_doses = 4, target = target) %>%
  get_dose_paths(cohort_sizes = rep(3, 2), previous_outcomes = '1NNN 2TNT')
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, viridis_palette = 'viridis')
}
```

Notice how this is different to the advice we get at the start of the
trial (i.e. when omitting `previous_outcomes`):

```
paths <- get_boin(num_doses = 4, target = target) %>%
  get_dose_paths(cohort_sizes = rep(3, 2))
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, viridis_palette = 'viridis')
}
```

This is a feature of model-based methods like CRM and BOIN - they use all information at all doses when making dose decisions. They are not memoryless like the 3+3.

The dose at which dose-paths commence is inferred from the model. It can be specified manually, however:

```
paths <- get_three_plus_three(num_doses = 5, allow_deescalate = TRUE) %>%
  get_dose_paths(cohort_sizes = c(3, 3), next_dose = 3)
if(Sys.getenv("RSTUDIO") == "1") {
  graph_paths(paths, viridis_palette = 'plasma')
}
```

A 3+3 trial with de-escalation enabled will de-escalate through the doses when toxicity is seen, as expected.

Calculating dose-paths is useful for visually examining the
conditions under which a dose-finding design would escalate or
de-escalate or stop. However, that is only part of the story. When we
marry dose-paths with assumed true event probabilities, we can calculate
exact operating characteristics of a design. We refer to this as
*crystallising* dose paths because the likelihood of each path
has been calculated precisely according to an assumed truth.
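To see what "calculated precisely" means, note that the probability of observing k toxicities in a cohort of size n, at a dose with true toxicity probability p, is binomial, and the probability of a whole path is the product over its cohorts. A minimal sketch in base R (the path and probabilities below are illustrative, not taken from the package internals):

```
# Probability of the illustrative path "NNN at dose 1, then NNT at dose 2"
# under assumed true toxicity probabilities 0.05 and 0.10:
p_path <- dbinom(0, size = 3, prob = 0.05) * dbinom(1, size = 3, prob = 0.10)
round(p_path, 3)
#> [1] 0.208
```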

For instance, we can calculate dose-paths for the first four cohorts of three patients using the CRM with stopping design that we specified previously:

```
skeleton <- c(0.05, 0.1, 0.25, 0.4, 0.6)
target <- 0.25
paths <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.9) %>%
  get_dose_paths(cohort_sizes = rep(3, 4))
#> You have requested 341 model evaluations. Be patient.
```

We can then crystallise the paths using toxicity probabilities that exactly match the beliefs in our skeleton:

```
true_prob_tox <- skeleton
x <- paths %>% calculate_probabilities(true_prob_tox = true_prob_tox)
x
#> Number of nodes: 213
#> Number of terminal nodes: 160
#>
#> Number of doses: 5
#>
#> True probability of toxicity:
#> 1 2 3 4 5
#> 0.05 0.10 0.25 0.40 0.60
#>
#> Probability of recommendation:
#> NoDose 1 2 3 4 5
#> 0.000127 0.019866 0.227135 0.451791 0.273014 0.028066
#>
#> Probability of continuance:
#> [1] 1
#>
#> Probability of administration:
#> 1 2 3 4 5
#> 0.3172 0.1602 0.1867 0.2734 0.0626
#>
#> Expected sample size:
#> 11.99887
#>
#> Expected total toxicities:
#> 2.705489
```

We see that in this scenario, the probability of a path stopping and advocating no dose within the first four cohorts is very close to zero. In contrast, when the toxicity probabilities are much greater than anticipated:

```
true_prob_tox <- c(0.45, 0.6, 0.68, 0.75, 0.81)
x <- paths %>% calculate_probabilities(true_prob_tox = true_prob_tox)
x
#> Number of nodes: 213
#> Number of terminal nodes: 160
#>
#> Number of doses: 5
#>
#> True probability of toxicity:
#> 1 2 3 4 5
#> 0.45 0.60 0.68 0.75 0.81
#>
#> Probability of recommendation:
#> NoDose 1 2 3 4 5
#> 0.3027462 0.6407708 0.0502073 0.0055527 0.0006931 0.0000299
#>
#> Probability of continuance:
#> [1] 0.697
#>
#> Probability of administration:
#> 1 2 3 4 5
#> 0.895800 0.052921 0.008539 0.042028 0.000711
#>
#> Expected sample size:
#> 10.73269
#>
#> Expected total toxicities:
#> 5.102913
```

the probability of selecting no dose within these first four cohorts
is over 30%. The information labelled `Probability of continuance:`
shows the aggregate probability of paths that are continuing to advocate
experimentation at doses, i.e. those paths that do not advocate stopping.

We see above that the probability of continuance is 1 minus the
probability of selecting no dose. However, this need not necessarily be
the case because paths can advocate stopping *and recommend a
dose*. They might do so once it is felt that a suitable dose has
been identified. To make this point, let us imagine that we add a rule
to our above design that allows stopping once there are 9 patients
allocated to the recommended dose. We can think of this as stopping for
consensus. Respecifying our model and recalculating dose-paths, we
have:

```
paths <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.9) %>%
  stop_when_n_at_dose(dose = 'recommended', n = 9) %>%
  get_dose_paths(cohort_sizes = rep(3, 4))
#> You have requested 341 model evaluations. Be patient.
```

```
x <- paths %>% calculate_probabilities(true_prob_tox = true_prob_tox)
x
#> Number of nodes: 141
#> Number of terminal nodes: 106
#>
#> Number of doses: 5
#>
#> True probability of toxicity:
#> 1 2 3 4 5
#> 0.45 0.60 0.68 0.75 0.81
#>
#> Probability of recommendation:
#> NoDose 1 2 3 4 5
#> 0.2103223 0.7393498 0.0440522 0.0055527 0.0006931 0.0000299
#>
#> Probability of continuance:
#> [1] 0.137
#>
#> Probability of administration:
#> 1 2 3 4 5
#> 0.895800 0.052921 0.008539 0.042028 0.000711
#>
#> Expected sample size:
#> 9.064864
#>
#> Expected total toxicities:
#> 4.352391
```

We see that this has reduced our probability of stopping for excess toxicity and inflated our chances of recommending dose 1. Notably, the probability of continuance is now roughly 14%, suggesting that most paths have advocated stopping by now, either for toxicity or consensus.

If we do not like that performance, we can make the consensus stopping rule more demanding by requiring 12 patients at the recommended dose to advocate stopping:

```
paths <- get_dfcrm(skeleton = skeleton, target = target) %>%
  stop_when_too_toxic(dose = 1, tox_threshold = 0.35, confidence = 0.9) %>%
  stop_when_n_at_dose(dose = 'recommended', n = 12) %>%
  get_dose_paths(cohort_sizes = rep(3, 4))
#> You have requested 341 model evaluations. Be patient.
```

```
x <- paths %>% calculate_probabilities(true_prob_tox = true_prob_tox)
x
#> Number of nodes: 213
#> Number of terminal nodes: 160
#>
#> Number of doses: 5
#>
#> True probability of toxicity:
#> 1 2 3 4 5
#> 0.45 0.60 0.68 0.75 0.81
#>
#> Probability of recommendation:
#> NoDose 1 2 3 4 5
#> 0.3027462 0.6407708 0.0502073 0.0055527 0.0006931 0.0000299
#>
#> Probability of continuance:
#> [1] 0.24
#>
#> Probability of administration:
#> 1 2 3 4 5
#> 0.895800 0.052921 0.008539 0.042028 0.000711
#>
#> Expected sample size:
#> 10.73269
#>
#> Expected total toxicities:
#> 5.102913
```

As usual, deriving an acceptable design is an iterative process. The
tools in `escalation` make it easier to arrive at a design that performs
how you want.

Combining dose-paths with true event probabilities allows
probabilistic inference on dose-finding designs. This is a novel
extension to the use advocated by Yap et al. (2017) and Brock et
al. (2017). The use of exact operating characteristics has been
implemented for the 3+3 design in the `bcrm` package (Sweeting and
Wheeler 2019). We generalise the method here.

We saw above that probabilistic inference was possible with
dose-paths. Researchers have typically used simulation to achieve this
task. `escalation` supports simulation as well, through the
`simulate_trials` function. However, this does raise the question:
when should you use each method, and what are their relative merits?

The answer to the first question comes down to the expected number of
model fits required. Fitting dose-finding models takes computer time. In
dose-paths, the model is fit once at each node. We have seen examples
above of how the number and size of cohorts affects the number of nodes
in dose-paths. In fact, `escalation` provides a function to calculate
the number of nodes.

In a phase 1 dose-finding trial, each patient experiences exactly one
of two outcomes: `T` or `N`. Let us calculate how many nodes there are
in a graph of dose-paths using five cohorts of three patients. We run:

```
num_dose_path_nodes(num_patient_outcomes = 2, cohort_sizes = rep(3, 5))
#> [1] 1 4 16 64 256 1024
```

The `num_patient_outcomes = 2` parameter reflects that patients may
experience `T` or `N`. The returned vector of integers is the number of
nodes at each depth. There is one starting node. That node is connected
to four children via outcomes `NNN`, `NNT`, `NTT`, and `TTT`. The
number of nodes at greater depths proceeds multiplicatively thereafter.
The total number of nodes is:

```
num_dose_path_nodes(num_patient_outcomes = 2, cohort_sizes = rep(3, 5)) %>%
  sum
#> [1] 1365
```
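The multiplicative structure is easy to reproduce in base R. With m possible patient outcomes, a cohort of size c has choose(c + m - 1, m - 1) distinct unordered outcome combinations, and the node counts at each depth are cumulative products of these. This is a sketch of the calculation, not the package's implementation:

```
cohort_sizes <- rep(3, 5)
m <- 2  # possible outcomes per patient: T or N
children <- choose(cohort_sizes + m - 1, m - 1)  # 4 children per node
nodes <- cumprod(c(1, children))                 # nodes at each depth
nodes
#> [1]    1    4   16   64  256 1024
sum(nodes)
#> [1] 1365
```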

Thus it requires exactly 1,365 model fits to calculate the exact operating characteristics in this \(n = 5 \times 3 = 15\) patient scenario. To compare this to simulations, consider that each simulated trial iteration will fit the model up to five times, once at the end of each cohort. The total number of model fits in a simulation study is bounded by this number multiplied by the number of simulated iterations. Generally simulation studies use thousands of replicates, so it is easy to see that exact inference via crystallised dose-paths will be much less computationally burdensome, and therefore faster here.

In contrast, now consider a trial of eight cohorts of three. The total number of model fits to enumerate the complete graph of dose-paths is

```
num_dose_path_nodes(num_patient_outcomes = 2, cohort_sizes = rep(3, 8)) %>%
  sum
#> [1] 87381
```

Ten thousand simulated iterations of 8 cohorts each would only require up to 80,000 model fits. Thus, a reasonably accurate simulation study would be expected to be faster here.
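The crossover is easy to see by comparing the two workloads directly. With two patient outcomes and cohorts of three, exact enumeration needs a geometric sum of model fits, while simulation needs at most one fit per cohort per iteration. A sketch, assuming 10,000 simulated iterations:

```
# Model fits for exact enumeration: each node has 4 children, so the
# total is a geometric sum over depths 0..n_cohorts.
exact_fits <- function(n_cohorts) sum(4 ^ (0:n_cohorts))
# Upper bound on model fits for simulation: one fit per cohort per iteration.
sim_fits <- function(n_cohorts, n_iter = 10000) n_iter * n_cohorts

exact_fits(5); sim_fits(5)  # 1365 vs 50000: exact enumeration wins
exact_fits(8); sim_fits(8)  # 87381 vs 80000: simulation wins
```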

However, speed is not the only concern: there is also precision to consider. Simulations have the disadvantage of suffering from Monte Carlo error, that is, the uncertainty about the estimated statistics arising from the use of a finite number of simulated iterations. In contrast, exact inference via dose-paths has the great advantage of being exact: there is no uncertainty in the calculated probabilities. (Note that there is still uncertainty about which path will be taken, because that is determined by random patient outcomes.) Thus, there are scenarios where dose-paths may still be preferable to simulations, even when they are expected to take longer.
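The scale of Monte Carlo error is easily quantified: the standard error of an estimated probability p from N independent simulated iterations is sqrt(p(1 - p)/N). For example:

```
# Monte Carlo standard error of an estimated probability
mc_se <- function(p, n_iter) sqrt(p * (1 - p) / n_iter)
round(mc_se(0.3, 1000), 4)   # with 1,000 iterations
#> [1] 0.0145
round(mc_se(0.3, 10000), 4)  # with 10,000 iterations
#> [1] 0.0046
```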

It is likely that in practice, simulation is used when dose-paths
would be a better option. If true, that would likely be linked to the
provision of software that performs the two methods. `escalation` plugs
that gap.

Brock, Kristian, Lucinda Billingham, Mhairi Copland, Shamyla Siddique,
Mirjana Sirovica, and Christina Yap. 2017. “Implementing the
EffTox Dose-Finding Design in the Matchpoint
Trial.” *BMC Medical Research Methodology* 17 (1): 112. https://doi.org/10.1186/s12874-017-0381-x.

Cheung, Ken. 2013. *Dfcrm: Dose-Finding by the Continual Reassessment
Method*. https://CRAN.R-project.org/package=dfcrm.

Liu, Suyu, and Ying Yuan. 2015. “Bayesian Optimal Interval Designs
for Phase I Clinical Trials.” *Journal of the
Royal Statistical Society: Series C (Applied Statistics)* 64 (3):
507–23. https://doi.org/10.1111/rssc.12089.

O’Quigley, J, M Pepe, and L Fisher. 1990. “Continual Reassessment
Method: A Practical Design for Phase 1 Clinical Trials in
Cancer.” *Biometrics* 46 (1): 33–48. https://doi.org/10.2307/2531628.

Sweeting, Michael, and Graham Wheeler. 2019. *Bcrm: Bayesian
Continual Reassessment Method for Phase I Dose-Escalation Trials*.
https://CRAN.R-project.org/package=bcrm.

Wheeler, Graham M., Adrian P. Mander, Alun Bedding, Kristian Brock,
Victoria Cornelius, Andrew P. Grieve, Thomas Jaki, et al. 2019.
“How to Design a Dose-Finding Study Using the Continual
Reassessment Method.” *BMC Medical Research Methodology*
19 (1). https://doi.org/10.1186/s12874-018-0638-z.

Yap, Christina, Lucinda J. Billingham, Ying Kuen Cheung, Charlie
Craddock, and John O’Quigley. 2017. “Dose Transition Pathways:
The Missing Link Between Complex Dose-Finding Designs and
Simple Decision-Making.” *Clinical Cancer Research* 23
(24): 7440–47. https://doi.org/10.1158/1078-0432.CCR-17-0582.

Yuan, Ying, and Suyu Liu. 2018. *BOIN: Bayesian Optimal INterval
(BOIN) Design for Single-Agent and Drug-Combination Phase I Clinical
Trials*. https://CRAN.R-project.org/package=BOIN.