Experimental approaches in biology tend to fall within broader conceptual frameworks that guide the logic of the experimental design. Each of these frameworks carries both a cost and an expected quality of the knowledge obtained from the results. For example, on one end we have multifactorial perturbation frameworks, where we collect samples with little or no control over the perturbations affecting each sample and attempt to infer which variables are correlated in the system. Here, data is cheap, as it can even be collated from already available datasets, but the inferences obtained are noisy and require extensive downstream validation. Multifactorial perturbation is very popular in genomics, especially in transcriptomics, since a deluge of publicly available data means we can cheaply produce hypotheses, providing a useful starting point.

On the other end, we might have the perturb-and-rescue experimental framework. Here, one is interested in proving that some function $f$, read out by some signal $S$, depends on the relationship between $A$ and $B$ (e.g. the functional coupling of two genes, the pairing of two particular residues in a macromolecule, etc.). One attempts to do so by first perturbing, say, $A$ to sever its relationship with $B$ and watching for a drop in signal $S$, then perturbing $B$ in such a way that one would expect the relationship to $A$ to be restored. If the signal $S$ turns on again, then the $(A,B)$ relationship was ‘rescued’. This is readily illustrated by attempting to prove that two residues of an RNA molecule, say $R_1$ and $R_2$, are paired when it folds. Suppose that $R_1$ is an adenine and $R_2$ a uracil, and that our readout $S$ is some signal that tells us whether those residues are exposed to solution (i.e. not paired to anything) or not. If we perturb $R_1$ by changing it to a cytosine, then we expect the signal $S$ to change from paired to unpaired status for both $R_1$ and $R_2$, since the pair would be broken.
But if we then change $R_2$ to a guanine, we expect $S$ to change back to paired status for both residues, thus rescuing the relationship. A perturb-and-rescue experiment is difficult: you need not only to finely perturb each variable, but also to design an additional perturbation that rescues the first. This particular experiment is a favorite of one of my doctoral advisors, who once said to me that he ‘could not think of a more convincing experiment’ to validate RNA secondary structure. It’s also used heavily in other contexts. If the perturbation and rescue actions rest on solid theory (like the pyrimidine-purine pairing patterns of nucleotides in this case), then a positive outcome of the experiment is hard to refute, and further validation would only be testing the generality of the system where the experiment was performed (say, if this was done *in vitro* but we’re interested in an *in vivo* setting).
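The RNA example above can be sketched as a tiny simulation. This is a minimal illustration assuming a simple Watson-Crick pairing rule and a noiseless binary signal $S$ (real structure probing readouts are far messier); the function names are my own:

```python
# Hypothetical sketch of perturb-and-rescue for a single RNA base pair.
# Assumes an idealized, noiseless readout S: 1 if the residues can pair
# (i.e. not exposed to solution), 0 otherwise.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def signal(r1: str, r2: str) -> int:
    """S = 1 if the two residues can form a Watson-Crick pair, else 0."""
    return int((r1, r2) in WATSON_CRICK)

wildtype = signal("A", "U")   # R1-R2 paired: S = 1
perturbed = signal("C", "U")  # perturb R1 -> pair broken: S = 0
rescued = signal("C", "G")    # compensatory mutation at R2 restores S = 1

assert (wildtype, perturbed, rescued) == (1, 0, 1)
```

The $(1, 0, 1)$ trajectory of the signal across the three conditions is what makes the argument compelling: the drop confirms the perturbation severed the pair, and the recovery confirms the pairing relationship specifically.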

While it is intuitively obvious that perturb-and-rescue is more informative than multifactorial perturbation, is there a way we can formally express this? One road we can take is the Bayesian one, in particular the framework of Bayesian experimental design. In this setting, we have a set of experiments $E$ from which we can choose $e_1, e_2, …, e_n$ with $e_i \in E$, which result in outcomes (i.e. experimental results) $S_1, S_2, …, S_n$, and the goal is to minimize some notion of uncertainty $U$ (e.g. entropy) acting on the posterior over the hypothesis we’re trying to test, $P(H|D) \propto P(D|H)P(H)$, where $D = (e_1, S_1), (e_2, S_2), …, (e_n, S_n)$. The main hurdle in computing $P(H|D)$ is reasoning over experiments and outcomes that have not yet happened. To get around this, we can posit a distribution over the outcomes given each experiment, that is, $P(S_i|H, e_i)$. This approach works well if we perform experiments sequentially, as we can incorporate the results of each experiment as they come: at each point $t-1$ in the sequence of experiments, one typically computes a probability distribution over the outcome of the next experiment given all the previous ones, $P(S_t|H, e_t)P(H|(e_1, S_1), …, (e_{t-1}, S_{t-1}))$. By thinking of $S_t$ probabilistically, we can calculate the expected value of $U$ and choose as the next experiment the one that minimizes it. The structure of $P(S_t|H, e_t)P(H|(e_1, S_1), …, (e_{t-1}, S_{t-1}))$ dictates the type and complexity of the problem and the approach used to solve it, which can take the form of a dynamic programming or MCMC solution. In some instances, for example where $E$ is some continuous interval and we expect nearby experimental settings in $E$ to yield similar outcomes, the task becomes very similar to a Bayesian optimization problem, where a Gaussian process with a covariance function that captures these local dependencies can be used to model the outcome variable and is sequentially updated to minimize its uncertainty.
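The greedy step of this sequential scheme can be sketched concretely for a discrete hypothesis space, using Shannon entropy as $U$. The two-hypothesis, two-experiment setup and the likelihood numbers below are purely illustrative assumptions:

```python
# Minimal sketch of greedy Bayesian experimental design: pick the experiment
# that minimizes the expected entropy of the posterior over hypotheses.
import math
from itertools import product

def entropy(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

def posterior(prior, lik, e, s):
    """P(H | e, S=s) proportional to P(S=s | H, e) P(H)."""
    un = [lik[(h, e, s)] * prior[h] for h in range(len(prior))]
    z = sum(un)
    return [u / z for u in un]

def expected_entropy(prior, lik, e, outcomes=(0, 1)):
    """E_S[ U(P(H | e, S)) ], averaging over the predicted outcome S."""
    total = 0.0
    for s in outcomes:
        p_s = sum(lik[(h, e, s)] * prior[h] for h in range(len(prior)))
        if p_s > 0:
            total += p_s * entropy(posterior(prior, lik, e, s))
    return total

# Toy likelihood table: experiment 0 is uninformative (outcome independent
# of H), experiment 1's outcome tracks the hypothesis 90% of the time.
lik = {}
for h, e, s in product(range(2), range(2), range(2)):
    lik[(h, e, s)] = 0.5 if e == 0 else (0.9 if s == h else 0.1)

prior = [0.5, 0.5]
best = min(range(2), key=lambda e: expected_entropy(prior, lik, e))
assert best == 1  # the discriminating experiment minimizes expected entropy
```

In a sequential loop, one would run the chosen experiment, fold the observed outcome into the prior via `posterior`, and repeat.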

So what happens when you apply the Bayesian experimental design framework to a multifactorial perturbation set of experiments? Here, we can think of each sample in a multifactorial perturbation as a separate experiment. Because we have no control over the perturbations, we have a very uninformative prior on what we expect from each experiment; that is, our $P(S_i|H, e_i)$ is basically the same for any $i$, and most of the time it assumes a Gaussian form where the interactions of the variables of interest are encoded in the covariance matrix. Thus, each experiment is equally informative, and even after updating our model after several experiments using $P(S_t|H, e_t)P(H|(e_1, S_1), …, (e_{t-1}, S_{t-1}))$, the information provided by the remaining experiments stays unchanged relative to each other. This is the least informative experimental approach, but it also requires the least effort: no perturbation control and the fewest assumptions in the model. If we want to break this uniformity, we need to either start controlling the perturbations or start assuming some structure within each experiment (e.g. maybe some of the samples come from a special subpopulation over whose distribution we have a prior).
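This uniformity can be made concrete: if the outcome model $P(S_i|H, e_i)$ does not depend on $e_i$, every candidate experiment has exactly the same expected posterior entropy, so none is preferred. A small check with a made-up Bernoulli outcome model (the numbers are illustrative, not from any real dataset):

```python
# Illustrative check: when P(S | H, e) is identical across experiments e,
# every experiment yields the same expected posterior entropy.
import math

def entropy(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

# Assumed outcome model: P(S=1 | H=h) = theta[h], the same for every e.
theta = {0: 0.3, 1: 0.7}
prior = {0: 0.5, 1: 0.5}

def expected_posterior_entropy(e):
    # `e` is unused on purpose: with no perturbation control, the likelihood
    # cannot distinguish experiments, so the result is identical for every e.
    total = 0.0
    for s in (0, 1):
        un = {h: (theta[h] if s == 1 else 1 - theta[h]) * prior[h] for h in prior}
        z = sum(un.values())
        total += z * entropy([u / z for u in un.values()])
    return total

gains = [expected_posterior_entropy(e) for e in range(3)]
assert all(abs(g - gains[0]) < 1e-12 for g in gains)
```

Breaking this tie requires making the likelihood actually depend on $e_i$, which is exactly what controlling the perturbations (or assuming per-sample structure) buys you.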

Now, suppose that we apply the same framework to the perturb-and-rescue problem, where we want to test whether the relationship between $A$ and $B$ is necessary for some function $f$, given some signal $S$ that is a readout of $f$ in some way. To make things simple, say that our signal is binary, $S \in \{ 0, 1 \}$, with $S = 1$ when $f$ is active and $S = 0$ otherwise, and that the set of perturbations we can perform is $\{ P_A, P_B, P_A R_B \}$. The notation for the experiments requires some explanation:

- $P_X$: we perturb $X$ and expect $S$ to turn off, i.e. become zero.
- $P_X R_Y$: we perturb $X$ but also design a perturbation of $Y$ that we expect will rescue $f$, yielding $S = 1$.

We further assume that we have performed a control experiment, where we observe that $S = 1$ when $f$ is unperturbed. One may be tempted to declare $\{ P_A, P_B, P_A R_B \}$ the set of our available experiments. However, this is not a correct model: in our mental model, $P_A$ and $P_A R_B$ share information and should be put into the same probability distribution. Instead, our experiments become binary vectors of the form $(p_A, p_B, p_A r_B)$, where each component indicates whether the corresponding perturbation was included in the experiment (e.g. the vector $(1,0,0)$ would indicate that perturbation $P_A$ was performed but not $P_B$ or $P_A R_B$). The outcomes are also binary vectors of dimension 3, corresponding to the resulting signal $S$ for each perturbation. Here, our $P(S_i|H, e_i)$ is very tight, as we have precise likelihoods of what we expect the signal to be. Not only that, but there is one experiment that is maximally informative above all others: the actual perturb-and-rescue experiment, $(p_A=1, p_B=0, p_A r_B=1)$.
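We can check this numerically. The three competing hypotheses, the expected-signal table, and the readout noise level below are my own illustrative assumptions (the text only specifies the coupling hypothesis); under them, a sketch of the expected information gain for each experiment vector:

```python
# Toy check that the perturb-and-rescue design (p_A=1, p_B=0, p_Ar_B=1)
# attains the maximal expected information gain.
import math
from itertools import product

# Assumed expected signal after each perturbation (P_A, P_B, P_A R_B) under
# three hypothetical explanations of the control observation S = 1:
#   H1: f depends on the A-B relationship (rescue works)
#   H2: A is not required for f, but B is (rescue fails)
#   H3: A and B are each required independently (rescue fails)
MEAN_S = {
    "H1": (0, 0, 1),
    "H2": (1, 0, 0),
    "H3": (0, 0, 0),
}
EPS = 0.05  # assumed probability that the binary readout flips

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

def info_gain(experiment):
    """Expected reduction in posterior entropy for a binary experiment vector."""
    prior = {h: 1 / 3 for h in MEAN_S}
    dims = [i for i, included in enumerate(experiment) if included]
    gain = entropy(prior.values())
    # Average over the possible outcomes of the included perturbations only.
    for vals in product((0, 1), repeat=len(dims)):
        un = {}
        for h in prior:
            p = prior[h]
            for i, s in zip(dims, vals):
                p *= (1 - EPS) if s == MEAN_S[h][i] else EPS
            un[h] = p
        z = sum(un.values())
        gain -= z * entropy([u / z for u in un.values()])
    return gain

experiments = list(product((0, 1), repeat=3))
best_gain = max(info_gain(e) for e in experiments)
maximizers = [e for e in experiments if info_gain(e) >= best_gain - 1e-9]
assert min(maximizers, key=sum) == (1, 0, 1)
```

Under these assumptions, $(1,0,1)$ beats every experiment that omits either the perturbation or the rescue; the only other maximizer is $(1,1,1)$, which adds $P_B$, a perturbation whose expected outcome is identical under all three hypotheses here and so contributes nothing, which is why the tie is broken by perturbation count.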

There’s a sort of satisfaction that comes with putting experimental approaches into this framework, as it forces one to write down the actual assumptions. In the perturb-and-rescue case, for example, I kept trying to make each perturbation its own experiment, but kept arriving at conceptual contradictions; even defining what the actual experimental unit is can run counter to our intuition. Maybe we don’t ever need to specify a likelihood function or perform any calculation, but having even a qualitative understanding of the structure of our assumptions can go a long way.