Pooled analyses (ie, non-exhaustive quantitative syntheses that pool individual participant data (IPD) from several independent randomised controlled trials (RCTs) exploring similar research questions) are frequently used in the evaluation of new treatments, and can influence day-to-day clinical practice. These studies often explore subgroup effects or examine new questions across different combinations of all available studies.
However, unlike IPD meta-analyses, which draw on the full set of trials, pooled analyses are not based on an exhaustive sample of RCTs. They therefore carry a risk that results are manipulated through the selection of favourable combinations of studies.
This has been observed for duloxetine in the treatment of depression, with at least 43 pooled analyses. Such "salami slicing" may be used by the drug industry to help widely disseminate positive findings regarding its products.
Indeed, analyses across different combinations of trials and repeated endpoint measures can lead to variability in the estimated effects (ie, vibration of effect, VoE). We therefore planned to evaluate the impact of conducting all possible pooled analyses across different combinations of randomised controlled trials and endpoints. We explored this question with the example of canagliflozin, a drug used to treat type 2 diabetes mellitus. It is an interesting example because the clinical value of drugs that reduce chronic hyperglycaemia, as measured by HbA1c, remains uncertain: their effects on clinical outcomes such as cardiovascular events are less clear, as is the case for metformin, for example.
A multiverse analysis
We requested individual patient data of 15 094 trial participants from 12 randomised controlled trials comparing canagliflozin treatment with placebo, shared by Johnson and Johnson on the Yale University Open Data Access project (YODA) platform, up to 16 April 2021.
We performed a multiverse analysis, consisting of all possible pooled analyses using these individual participant data. Pooled analyses estimated changes in serum glycated haemoglobin (HbA1c), major adverse cardiovascular events (MACE), and serious adverse events at weeks 12, 18, 26, and 52. The distribution of effect estimates was calculated for all possible combinations, and the direction and magnitude of the 1st and 99th centiles of this distribution were compared. A Janus effect, ie, substantial VoE, is present when the 1st and 99th centiles of the effect estimates of pooled analyses point in opposite directions. Our hypotheses were that vibration of effect would not be observed for HbA1c, but would be observed for both MACE and serious adverse events.
Across 16 332 distinct pooled analyses for changes in HbA1c, standardised effect estimates were in favour of canagliflozin treatment at both the 1st centile (−0.75%) and 99th centile (−0.48%); 15 994 (98%) analyses showed significant results (P<0.05) in favour of canagliflozin. For major adverse cardiovascular events, estimated hazard ratios were 0.20 at the 1st centile and 0.90 at the 99th centile; 2705 of 8144 analyses (33.21%) were significant, all of which were in favour of canagliflozin treatment. For serious adverse events, estimated hazard ratios were 0.59 at the 1st centile and 1.14 at the 99th centile; 5793 of 16 332 (35.47%) analyses were significant, with 5754 in favour of canagliflozin and 39 in favour of placebo.
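To make the mechanics of such a multiverse concrete, here is a minimal sketch in Python. It is not the code used in the study. It assumes hypothetical per-trial summary estimates rather than individual participant data, a simple inverse-variance fixed-effect model, and estimates on a scale where 0 means no effect (eg, a mean difference or a log hazard ratio). All trial names, numbers, and function names are invented for illustration. With 12 trials and combinations of at least two trials, there are 2^12 − 1 − 12 = 4083 possible pooled analyses per time point, which over four time points is consistent with the 16 332 analyses reported for HbA1c.

```python
from itertools import combinations
from math import sqrt
from statistics import quantiles

# Hypothetical per-trial summaries: (effect estimate, standard error).
# Illustrative numbers only, NOT data from the canagliflozin trials.
# Estimates are on a scale where 0 means no effect.
trials = {
    "trial_01": (-0.62, 0.08),
    "trial_02": (-0.71, 0.10),
    "trial_03": (-0.55, 0.07),
    "trial_04": (-0.80, 0.12),
    "trial_05": (-0.49, 0.09),
    "trial_06": (-0.66, 0.06),
    "trial_07": (-0.58, 0.11),
    "trial_08": (-0.74, 0.10),
    "trial_09": (-0.52, 0.08),
    "trial_10": (-0.69, 0.09),
    "trial_11": (-0.60, 0.07),
    "trial_12": (-0.77, 0.13),
}


def pool_fixed_effect(summaries):
    """Inverse-variance fixed-effect pooling (an assumed, simplified model)."""
    weights = [1.0 / se ** 2 for _, se in summaries]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, summaries)) / total
    return pooled, sqrt(1.0 / total)


def multiverse(trial_summaries, min_trials=2):
    """Pooled estimate for every combination of at least `min_trials` trials."""
    names = sorted(trial_summaries)
    estimates = []
    for size in range(min_trials, len(names) + 1):
        for combo in combinations(names, size):
            pooled, _ = pool_fixed_effect([trial_summaries[n] for n in combo])
            estimates.append(pooled)
    return estimates


def centiles_1_and_99(estimates):
    """1st and 99th centiles of the distribution of pooled estimates."""
    cut_points = quantiles(estimates, n=100)  # 99 cut points
    return cut_points[0], cut_points[-1]


def has_janus_effect(estimates, null_value=0.0):
    """True when the 1st and 99th centiles lie on opposite sides of the null."""
    low, high = centiles_1_and_99(estimates)
    return low < null_value < high


estimates = multiverse(trials)
low, high = centiles_1_and_99(estimates)
print(len(estimates), round(low, 2), round(high, 2), has_janus_effect(estimates))
```

Because every hypothetical trial here favours treatment, all pooled estimates share the same sign and no Janus effect can appear. The interesting cases are those where individual trial estimates fall on both sides of the null.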
We observed no Janus effect for the mean difference in HbA1c, although the point estimate varied considerably. In our analyses of the vibration of effect on major adverse cardiovascular events, a clinically relevant outcome, we also observed no Janus effect. However, the VoE affected whether the efficacy of canagliflozin on major adverse cardiovascular events was detected: only about a third of the pooled analyses reached significance.
Changes in the direction of effect estimates, and occasionally in their significance, are to be expected from sampling variability alone (see, for example, this other figure looking at 95% confidence intervals). Heterogeneity, bias in some of the initial studies, and the magnitude of the effect might also affect the existence of vibration of effect and the presence of a Janus effect. However, we believe that the bigger concern for pooled analyses is selection or availability bias in the individual participant data retrieved, which could itself affect the existence of vibration of effect. These findings suggest that results from pooled analyses should be critically appraised.
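The role of sampling variability and of the magnitude of the effect can be illustrated with a small simulation. This is a sketch under strong assumptions, not a reanalysis. Every simulated trial estimates the same true effect with the same standard error (no heterogeneity, no bias), pooling uses inverse-variance weights, and the number of trials, the standard error, and the function names are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations
from statistics import quantiles


def pooled_estimates(estimates, standard_errors, min_trials=2):
    """Inverse-variance pooled estimate for every combination of trials."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    indices = range(len(estimates))
    pooled = []
    for size in range(min_trials, len(estimates) + 1):
        for combo in combinations(indices, size):
            total = sum(weights[i] for i in combo)
            pooled.append(sum(weights[i] * estimates[i] for i in combo) / total)
    return pooled


def janus_rate(true_effect, se=0.2, n_trials=8, n_sims=200, seed=1):
    """Share of simulated multiverses whose 1st and 99th centiles straddle 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Each trial's observed effect differs from the truth by sampling error only.
        observed = [rng.gauss(true_effect, se) for _ in range(n_trials)]
        cuts = quantiles(pooled_estimates(observed, [se] * n_trials), n=100)
        if cuts[0] < 0.0 < cuts[-1]:
            hits += 1
    return hits / n_sims


null_rate = janus_rate(true_effect=0.0)     # no true effect
strong_rate = janus_rate(true_effect=-5.0)  # effect very large relative to the SE
print(null_rate, strong_rate)
```

When the true effect is null, opposite-direction centiles can arise purely by chance. When the effect is very large relative to its standard error, they essentially cannot, which matches the contrast between HbA1c and the less frequent clinical outcomes described above.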
Vibration of effects
Vibration of effects has been described in many fields, such as clinical epidemiology and microbiome-clinical associations. Modeling VoE has been proposed as "a critical step in navigating discovery in observational data, discerning robust associations, and cataloging adjusting variables that impact model output."
Meta-analyses are supposed to be less affected by researcher degrees of freedom. Still, in a first case study, we found substantial VoE in an indirect comparison of nalmefene versus naltrexone in the treatment of alcohol use disorders. Our new study explores VoE in a new context, ie, pooled analyses of IPD. A third case study, in head-to-head meta-analyses of acupuncture, will follow very soon.
We think that the vibration-of-effect approach shows promise in exploring issues related to reproducibility, especially because overlapping meta-analyses with divergent conclusions are not rare in the literature. However, recommending the method for all IPD meta-analyses and pooled analyses would be premature. We recommend that future research systematically explore vibration of effect in a large set of meta-analyses in order to give a better indication of its relevance.