Indirect Comparison: a fundamental breakthrough in evidence synthesis. But how valid is it?

This is a CORRECTED version of the following post on ECMI (European Consortium for Mathematics in Industry).


It is quite unrealistic to decide on the benefits and harms of an intervention, or the validity of a clinical hypothesis, based on a single study alone. Results often vary from one study to the next. Meta-analytic approaches have been developed gradually over decades to synthesize the body of evidence gathered through a systematic search process. They typically generate estimates of the comparative effects of all included interventions, along with their ranges of uncertainty. They are invaluable tools for healthcare entities and regulatory agencies around the world when making recommendations on pharmaceutical products; such agencies include the Health Products Regulatory Authority (HPRA) in Ireland and the Food and Drug Administration (FDA) in the US.

We call the XY trials the body of evidence comprising all relevant trials comparing treatment X and treatment Y. We restrict our attention to Randomized Controlled Trials (RCTs). A meta-analysis (MA) of the XY trials produces an overall pooled estimate (a weighted average) of the effect of Y relative to X, denoted dXY.
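As a minimal sketch of how such a pooled estimate arises, the classic fixed-effect approach weights each study's estimate by the inverse of its variance. The study effects and standard errors below are hypothetical numbers for illustration only, not data from any trial:

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted (fixed-effect) pooled estimate.

    effects: per-study estimates of d_XY (e.g. log odds ratios)
    std_errors: their standard errors
    Returns (pooled estimate, pooled standard error).
    """
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three hypothetical XY trials: log odds ratios and standard errors
d_xy, se_xy = fixed_effect_pool([0.30, 0.45, 0.20], [0.15, 0.20, 0.25])
print(f"pooled dXY = {d_xy:.3f} (SE {se_xy:.3f})")
```

Studies with smaller standard errors (typically larger studies) receive larger weights, which is exactly the "weighted average" mentioned above.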

Figure 1: The Indirect Comparison of A relative to B

Bucher et al. (1997) introduced Indirect Comparison (IC) to tackle situations in which we have only AC trials and BC trials (Figure 1), yet policymakers urgently need to inform the public on how A performs relative to B. It is also referred to as indirect MA or adjusted IC. Decision makers might not have the patience to wait for, or might lack sufficient funding to carry out, comparative experiments (RCTs) on A and B. Thus, they are regularly compelled to rely on the IC tool.

The question is, to what extent is an IC estimate valid?

The IC of A vs B, denoted IndAB, is adjusted by the results of the direct comparisons (dAC and dBC) with the common comparator C. It follows that any discrepancy in the AC or BC trials will be inherited by the AB comparison. Recall that similarity between studies is a key assumption in MA that ensures the validity of results. In IC, we must additionally stress the similarity of studies across comparisons. It is therefore reasonable to examine the validity of the included trials within, as well as across, the AC and BC comparisons.

Randomly allocating patients into two or more intervention groups is supposed to create a balanced distribution of (un)known and (un)measured prognostic factors across intervention groups within an RCT. However, this is often not the case, and randomization alone is not sufficient for comparability. It takes much more rigor and integrity to make the groups comparable. Some rigorous steps include:

  • pre-randomization blinding (allocation concealment): the treatment sequence should remain undisclosed to the parties involved
  • post-randomization blinding: parties should remain unaware of who is taking which treatment
  • blinding of outcome assessment: unblinded investigators might report a subjective intervention effect that does not reflect the true outcome response.

Blinding of outcome assessors, clinicians, and/or (some) participants is important in a randomized trial, as it ensures the integrity of the process. Lacking or unclear blinding and inadequate allocation concealment are markers of what is called selection bias, and they are known to be associated with biased results; biased here is with respect to the population we want to make recommendations for. These flaws may result in an overestimate of the benefits of treatments and an underestimate of their potential harms. Other common biases include:

  • Attrition bias: analysis of individual RCTs should be done on an intention-to-treat basis. Once a patient is randomized to a treatment, the final analysis should respect that allocation, even if the patient dropped out or was given a different treatment due to poor tolerability or toxicity of the initially assigned one. Patients with missing treatment responses should not be discarded from the final statistical analysis of the trial; otherwise, the advantages of randomization are reduced, as this virtually breaks the randomization.
  • Publication bias: negative results often arise from small studies and are consequently less likely to be published. Although including such studies would increase the sample size of the body of evidence, attention should be paid to the impact their negative results have on the simultaneous statistical synthesis.

Reporting bias, performance bias, and many other such markers related to the methodological flaws of a trial are threats to the internal validity of the trial.

A simple way to visualize this is that, before a trial is conducted, a pre-specified question is defined with a target population in mind. However, the patients finally included in the analysis might lack some characteristics defined a priori for the target population; included patients are not required to be representative of it. Thus, we might end up with a result that is unbiased with respect to the patients included in the trial, but biased with respect to the target population. A few markers that threaten the external validity of trials are:

  • Patient characteristics: disease severity at baseline, age, gender…
  • Trial characteristics/settings: primary, secondary, or tertiary care settings…

Both types of markers (internal and external) mentioned above are susceptible to modifying relative treatment effects and, as such, are referred to as effect-modifying covariates. The concept of similarity between studies in MA is thus defined with respect to such markers.

We have just seen that the analysis of an individual trial is vulnerable to both internal and external biases. In the same way, the analysis of a body of trials is exposed to such biases.

Randomization of patients is done within trials, not across trials. It follows that there is a real risk that patient characteristics are not comparable across trials, and thus not comparable across comparisons, on average. Trial/patient characteristics, outcome measures, follow-up duration, treatment dose, and many other (un)known factors might not be comparable across trials either.

Researchers relate external validity to generalisability. It allows us to say that, if A is better than C for patients in the AC trials, then it is still better than C for patients in the BC trials. Similarly, it extends the efficacy of B relative to C to the AC trial population. Doing so assumes that we are assessing the same outcome within a three-arm trial ABC, and we know that within RCTs:

  • relative treatment effects are consistent: IndAB = dAB = dAC + dCB, implying that if B is better than C (i.e., dCB > 0) and C is better than A (i.e., dAC > 0), then B is better than A as the sum of two positive values, making the comparison through C transitive. We call d the difference in treatment outcomes; d could be the log odds ratio or the mean difference.
  • treatments have the same indications: we would not want A to be a first-line regimen while B and C are second- or fourth-line.
  • patient populations in the C arms are assumed homogeneous: if patients randomized to C in the AC trials differ in any characteristic (age, dose administration) from those in the BC trials, then they should at least be similar on average (e.g., similar mean age). A special scenario to avoid is letting C be an ointment in the AC trials but an injection in the BC trials.
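The consistency relation above translates directly into Bucher's adjusted IC: since dCB = −dBC, we get IndAB = dAC + dCB = dAC − dBC, and, the two direct estimates coming from independent bodies of trials, their variances add. A minimal sketch of this computation follows; the pooled estimates and standard errors are hypothetical numbers, not from any real trials:

```python
import math

def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
    """Bucher (1997) adjusted indirect comparison of B relative to A.

    d_ac: pooled effect of C relative to A; d_bc: pooled effect of C
    relative to B (both, e.g., log odds ratios). Using d_CB = -d_BC,
    Ind_AB = d_AC + d_CB = d_AC - d_BC, and the variances add.
    """
    ind_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac**2 + se_bc**2)
    return ind_ab, se_ab

# Hypothetical pooled direct estimates from the AC and BC trials
ind_ab, se_ab = bucher_indirect(d_ac=0.50, se_ac=0.12, d_bc=0.20, se_bc=0.16)
lo, hi = ind_ab - 1.96 * se_ab, ind_ab + 1.96 * se_ab
print(f"IndAB = {ind_ab:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Note that the indirect standard error is always larger than either direct one, which is why an IC estimate is less precise than a head-to-head trial of the same size.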

Therefore, generalisability in IC relies on the similarity of known and unknown patient/trial characteristics, within and across comparisons.

An imbalance in the effect modifiers across the AC and BC comparisons is said to affect the indirect comparison IndAB through confounding bias. This said, to assess the validity of an IC, one has to check the similarity of the average distributions of the known effect modifiers. Interestingly, if the two sets of comparisons are similarly biased, the biases offset each other in the indirect contrast, and the IC will be unbiased.

Sometimes, direct evidence on AB is available but insufficient (e.g., Figure 2). It can then borrow strength from the indirect evidence when the two are combined in a Mixed Treatment Comparison (MTC) or a Network Meta-Analysis (NMA). This increases statistical power and improves precision on the AB comparison. However, if the direct and the indirect evidence on AB do not both similarly reflect the true relationship between the two treatments, their combination would be invalid. Knowing which of the indirect or direct evidence provides the less biased estimate remains a subject of further research.

Figure 2: Here we have a Mixed (direct + indirect via C) evidence of A vs B
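When the two sources are judged combinable, the simplest fixed-effect way to mix them is, once again, inverse-variance weighting of the direct and indirect estimates. This is only a sketch of the idea, not the full MTC/NMA machinery, and the numbers are hypothetical:

```python
import math

def mixed_estimate(d_direct, se_direct, d_indirect, se_indirect):
    """Inverse-variance combination of direct and indirect evidence on AB.

    Valid only under consistency, i.e. both estimates target the same
    true d_AB. The combined SE is smaller than either input SE, which
    is the 'borrowing strength' gain in precision.
    """
    w_dir = 1.0 / se_direct**2
    w_ind = 1.0 / se_indirect**2
    d_mixed = (w_dir * d_direct + w_ind * d_indirect) / (w_dir + w_ind)
    se_mixed = math.sqrt(1.0 / (w_dir + w_ind))
    return d_mixed, se_mixed

# Hypothetical sparse direct AB evidence combined with the indirect estimate
d_mix, se_mix = mixed_estimate(0.35, 0.30, 0.30, 0.20)
print(f"mixed dAB = {d_mix:.3f} (SE {se_mix:.3f})")
```

If the direct and indirect estimates disagree markedly, this disagreement (inconsistency) is itself a warning sign that the combination above should not be trusted.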

When appraising a new intervention, models that include study/patient-level covariates are best suited to assess its potential added value and to identify subgroups where its efficacy appears most promising. However, effect-modifying covariates are often under-reported. Therefore, policy-makers might make uncertain or wrong decisions, because these are based on lower-quality or limited evidence. Postponing the decision might be wise; being transparent and explicit will surely increase the quality of the decision. Following the work of Bucher et al. (1997), Lu and Ades (2004) conceived MTC in a Bayesian framework incorporating multi-arm studies (and so including loops of evidence). This approach has been powerful in showing how parameter uncertainty can be combined with variation within individual trials and heterogeneity across meta-analyses.

  1. Bucher HC, et al. (1997). The Results of Direct and Indirect Treatment Comparisons in Meta-Analysis of Randomized Controlled Trials. J Clin Epidemiol 50(6):683–691.
  2. Jansen JP, et al. (2011). Interpreting Indirect Treatment Comparisons and Network Meta-Analysis for Health-Care Decision Making: Report of the ISPOR Task Force on Indirect Treatment Comparisons Good Research Practices: Part 1. ISPOR/Elsevier. doi:10.1016/j.jval.2011.04.002.
  3. Song F, et al. (2008). Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. J Clin Epidemiol 61. doi:10.1016/j.jclinepi.2007.06.006.
  4. Dias S, Ades AE, et al. (2018). Network Meta-Analysis for Decision-Making. New York: John Wiley & Sons, Inc.

Teaching/Research Assistant in Mathematics/(Bayesian) Statistics, Writer
