18 April 2023
Imagine a trial with the following planning assumptions:
We plan to conduct an interim analysis, using an O’Brien-Fleming-like alpha-spending function, once half of the patients have been observed.
The desired power at interim for true effect \(\delta = \tilde\delta\) and standard deviation \(\sigma=\tilde\sigma\) is \(1-\tilde{\beta}_{IA} = 0.2525\), and the desired overall power is (approximately) \(1-\tilde{\beta} = 0.9\).
Note that in this example, the sample size for the two-stage design is approximately the same as for the single-stage design.
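As a rough plausibility check of the interim power, the following minimal sketch evaluates a Lan-DeMets O'Brien-Fleming-like spending function at information fraction 0.5 and the resulting first-stage power. The one-sided significance level \(\alpha = 0.025\) is an assumption (it is not stated above), and the drift is approximated by that of the single-stage design, so the result only roughly reproduces the planned value of 0.2525. The function names are illustrative.

```python
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

def obf_like_alpha_spent(t, alpha):
    """Cumulative one-sided alpha spent at information fraction t
    (Lan-DeMets O'Brien-Fleming-like spending function)."""
    return 2.0 * (1.0 - norm.cdf(norm.inv_cdf(1.0 - alpha / 2.0) / t ** 0.5))

def interim_power(t, alpha, overall_power):
    """Approximate power at a first interim look at information fraction t."""
    # At the first look, all alpha spent so far defines the critical value.
    crit = norm.inv_cdf(1.0 - obf_like_alpha_spent(t, alpha))
    # Assumption: drift of the single-stage design with the same overall power.
    drift = norm.inv_cdf(1.0 - alpha) + norm.inv_cdf(overall_power)
    return 1.0 - norm.cdf(crit - drift * t ** 0.5)

# alpha = 0.025 (one-sided) is an assumed value, not taken from the text.
print(round(interim_power(0.5, 0.025, 0.9), 3))
```

Under these assumptions the interim power comes out close to 0.25. The small gap to 0.2525 is consistent with the two-stage design needing a slightly larger sample size than the single-stage design.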
Inclusion criteria (among other things)
However, the test outcome is only known after randomization. The number of ineligible patients is unknown (estimates range from 20% to 80%!).
Concerns for Power
The true power depends not just on the true values for the effect size and the standard deviation but also on the realized sample sizes.
For a fixed design, the resulting power (for true effect \(\delta = \tilde\delta\) and standard deviation \(\sigma=\tilde\sigma\)) as a function of the realized sample sizes \(n_1\) and \(n_2\) is given by: \[ 1-\beta = 1-\Phi\left(\Phi^{-1}(1-\alpha) - \left(\Phi^{-1}(1-\alpha) + \Phi^{-1}(1-\tilde\beta)\right) \sqrt{f}\right) \text{ with } f = \frac{n_1n_2}{\tilde n_1 \tilde n_2}, \] where \(\tilde n_1\) and \(\tilde n_2\) denote the planned sample sizes.
Note that \(f\) is independent of \(k\). It is an expression of how much the realized sample sizes deviate from the planned sample sizes (that would yield a power of \(1-\tilde{\beta}\)).
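The power formula for the fixed design can be evaluated directly. The sketch below is a minimal illustration using only the Python standard library. The default \(\alpha = 0.025\) (one-sided) and the example sample sizes are assumptions chosen for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

def realized_power(n1, n2, n1_planned, n2_planned,
                   alpha=0.025, planned_power=0.9):
    """Power of the fixed design given realized group sizes n1 and n2."""
    # f expresses how far the realized sizes deviate from the planned ones.
    f = (n1 * n2) / (n1_planned * n2_planned)
    z_alpha = norm.inv_cdf(1.0 - alpha)
    z_beta = norm.inv_cdf(planned_power)
    return 1.0 - norm.cdf(z_alpha - (z_alpha + z_beta) * f ** 0.5)

# Hypothetical example: 100 patients planned per group.
print(realized_power(100, 100, 100, 100))  # planned sizes recover the planned power
print(realized_power(110, 90, 100, 100))   # imbalance 20: f = 0.99, slightly lower power
```

With the realized sizes equal to the planned sizes, \(f = 1\) and the planned power is recovered. Any imbalance at the same total sample size gives \(f < 1\) and therefore a lower power.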
Randomization is a key element of randomized controlled trials (RCT). It reduces possible systematic bias of the treatment effect from confounding variables.
Several methods for randomization exist. In general, methods with low predictability of treatment assignments that, at the same time, lead to balanced sample sizes are preferable. In practice, a compromise between the two features often has to be made.
Some examples:
Complete randomization corresponds to tossing a fair coin at each treatment assignment for balanced allocation in the two-group case.
The Big Stick Design (BSD) is a variation of complete randomization with the aim to restrict the imbalance.
BSD adds the restriction that the treatment imbalance \(D_j\) at any allocation step \(j\) must not exceed, in either direction, a pre-defined maximum tolerated imbalance (\(mti\)). This ensures that the absolute imbalance never exceeds the \(mti\) at any point in the trial.
The procedure is defined as follows:
Accordingly, \(|D_{j}|\leq mti\) at all allocation steps \(j=1,...,\tilde{N}\), and the procedure corresponds to complete randomization with a reflecting barrier at \(-mti\) and \(+mti\).
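The reflecting-barrier behaviour can be illustrated with a short simulation. This is a minimal sketch, assuming the usual BSD rule: a fair coin is tossed while \(|D_{j-1}| < mti\), and the next patient is forced into the under-represented group once \(|D_{j-1}| = mti\). The function name, the seed, and the example values (200 patients, \(mti = 3\)) are illustrative assumptions.

```python
import random

def big_stick_allocation(n_patients, mti, rng):
    """Simulate one BSD allocation sequence; return the imbalances D_1..D_n."""
    d = 0  # D_j = (number in experimental group) - (number in control group)
    path = []
    for _ in range(n_patients):
        if d >= mti:        # upper barrier reached: force control
            p_experimental = 0.0
        elif d <= -mti:     # lower barrier reached: force experimental
            p_experimental = 1.0
        else:               # inside the barriers: fair coin
            p_experimental = 0.5
        d += 1 if rng.random() < p_experimental else -1
        path.append(d)
    return path

rng = random.Random(2023)  # fixed seed only for reproducibility
path = big_stick_allocation(200, mti=3, rng=rng)
print(max(abs(d) for d in path))  # never larger than the mti
```

Setting the \(mti\) at least as large as the number of patients means the barriers are never reached, so the procedure reduces to complete randomization.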
The \(mti\) is often chosen in a rather arbitrary way. One of our first aims was to connect the \(mti\) to an acceptable loss in power.
In the standard BSD, the \(mti\) is a fixed number. Our second aim was to allow the \(mti\) to depend on the number of currently enrolled patients.
In a group-sequential design, power also matters at the interim analysis \(\rightarrow\) stricter control of the \(mti\) might be required for stage I of the trial.
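The first aim, linking the \(mti\) to an acceptable loss in power, can be sketched for the fixed design as follows. The sketch assumes 1:1 allocation and that exactly the planned total \(N\) is enrolled, so that a final imbalance \(d\) gives \(n_1 = (N+d)/2\), \(n_2 = (N-d)/2\) and a ratio of realized to planned sample size products of \(f = 1-(d/N)^2\). The one-sided \(\alpha = 0.025\), the example numbers, and the function names are assumptions for illustration only.

```python
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

def power_at_imbalance(d, n_total, alpha=0.025, planned_power=0.9):
    """Power of the fixed design if the final imbalance is d (1:1 planning)."""
    f = 1.0 - (d / n_total) ** 2  # n1 * n2 relative to (N/2) * (N/2)
    z_alpha = norm.inv_cdf(1.0 - alpha)
    z_beta = norm.inv_cdf(planned_power)
    return 1.0 - norm.cdf(z_alpha - (z_alpha + z_beta) * f ** 0.5)

def largest_mti(n_total, min_power, alpha=0.025, planned_power=0.9):
    """Largest imbalance whose power is still at least min_power."""
    if min_power > planned_power:
        raise ValueError("min_power cannot exceed the planned power")
    mti = 0
    # Power decreases as |d| grows, so a simple upward search suffices.
    while (mti + 1 < n_total and
           power_at_imbalance(mti + 1, n_total, alpha, planned_power) >= min_power):
        mti += 1
    return mti

# Hypothetical example: N = 200 patients, at most one percentage point power loss.
print(largest_mti(200, min_power=0.89))
```

The same search could be repeated for each enrolment count to obtain an \(mti\) that varies with the number of currently enrolled patients, although the power criterion to apply before the final analysis is a design choice this sketch does not settle.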
A desired minimal power for the interim analysis can be guaranteed by adapting the \(mti\) for stage I (here: planned power is 0.2525, minimally desired power is 0.2508).
After the interim analysis, the \(mti\) is defined as for the standard FSD.
If a spending function is used (e.g. OBF-type alpha spending), flexibility can be increased by choosing an \(mti\) that ensures the minimally desired power regardless of when the interim analysis is conducted.
Besides imbalance and power, randomness is another important issue for a randomization procedure.
The Forcing Index (FI) has been proposed as a measure that ranges between 0 (\(\hat{=}\) “high randomness”) and 1 (\(\hat{=}\) “low randomness”).
If \(\phi_i\) denotes the probability that a patient is allocated to the experimental treatment given all previous allocations, then \(FI(i)=\sum_{j=1}^i \frac{E[|\phi_j-0.5|]}{i/4}\), where \(E [|\phi_j-0.5|]\) is the expected deviation of the conditional probability of allocating the experimental treatment from the unconditional target value of 0.5 (Berger et al. 2021).
This means that for CR, we have \(FI(i)\equiv 0\) for all \(i\geq 1\).
For PBD with block size 2, \(FI(i)\rightarrow 1\) for \(i \rightarrow \infty\).
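Both limiting cases can be checked numerically from the definition of \(FI(i)\). In the sketch below, the helper names are illustrative. For CR every \(\phi_j\) equals 0.5, and for PBD with block size 2 the first allocation in each block is a fair coin while the second is fully determined, so \(|\phi_j - 0.5|\) alternates between 0 and 0.5.

```python
def forcing_index(expected_abs_dev):
    """FI(i) with i = len(expected_abs_dev), given E|phi_j - 0.5| for j = 1..i."""
    i = len(expected_abs_dev)
    return sum(expected_abs_dev) / (i / 4)

def cr_deviations(i):
    # Complete randomization: phi_j = 0.5 at every step.
    return [0.0] * i

def pbd2_deviations(i):
    # Block size 2: odd steps are a fair coin, even steps are deterministic.
    return [0.0 if j % 2 == 1 else 0.5 for j in range(1, i + 1)]

print(forcing_index(cr_deviations(100)))    # 0.0
print(forcing_index(pbd2_deviations(100)))  # 1.0
print(forcing_index(pbd2_deviations(101)))  # slightly below 1 for odd i
```

For PBD with block size 2 the index equals 1 at the end of every complete block and lies just below 1 in between, which is why the statement is a limit for \(i \rightarrow \infty\).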
In order to jointly compare balance and predictability, we also assess imbalance via a measure that can take values between 0 (\(\hat{=}\) “low imbalance”) and 1 (\(\hat{=}\) “high imbalance”).
Here, we take the cumulative average loss after \(n\) allocations, which is defined as \(Imb(n)=\frac{1}{n}\sum_{i=1}^n E[D_i^2]/i\), where \(D_i\) is the difference between the allocations in the two treatment groups (Berger et al. 2021).
If \(n\rightarrow\infty\), \(Imb(n)\) converges to 1 for CR.
\(Imb(n)\) converges to 0 for PBD with block length 2.
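The two limits can be reproduced from the definition of \(Imb(n)\). In this minimal sketch (helper names are illustrative), CR uses \(E[D_i^2] = i\), which holds because \(D_i\) is a sum of \(i\) independent \(\pm 1\) steps, and PBD with block length 2 uses \(D_i^2 = 1\) in the middle of a block and \(D_i^2 = 0\) at its end.

```python
def cumulative_average_loss(expected_d_squared):
    """Imb(n) with n = len(expected_d_squared), given E[D_i^2] for i = 1..n."""
    n = len(expected_d_squared)
    return sum(e / i for i, e in enumerate(expected_d_squared, start=1)) / n

def cr_expected_d2(n):
    # Complete randomization: variance of a simple random walk after i steps.
    return [float(i) for i in range(1, n + 1)]

def pbd2_expected_d2(n):
    # Block length 2: imbalance of one patient mid-block, none at block end.
    return [1.0 if i % 2 == 1 else 0.0 for i in range(1, n + 1)]

print(cumulative_average_loss(cr_expected_d2(100)))     # 1.0
print(cumulative_average_loss(pbd2_expected_d2(100)))   # small
print(cumulative_average_loss(pbd2_expected_d2(1000)))  # smaller still
```

For CR the measure equals 1 for every \(n\), not only in the limit, while for PBD with block length 2 it decays slowly towards 0 as \(n\) grows.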
Berger, V., Bour, L., Carter, K. et al. A roadmap to using randomization in clinical trials. BMC Med Res Methodol 21, 168 (2021). https://doi.org/10.1186/s12874-021-01303-z