Standardized mean differences cause funnel plot distortion in publication bias assessments
Abstract
Metaanalyses are increasingly used for synthesis of evidence from biomedical research, and often include an assessment of publication bias based on visual or analytical detection of asymmetry in funnel plots. We studied the influence of different normalisation approaches, sample size and intervention effects on funnel plot asymmetry, using empirical datasets and illustrative simulations. We found that funnel plots of the Standardized Mean Difference (SMD) plotted against the standard error (SE) are susceptible to distortion, leading to overestimation of the existence and extent of publication bias. Distortion was more severe when the primary studies had a small sample size and when an intervention effect was present. We show that using the Normalised Mean Difference measure as effect size (when possible), or plotting the SMD against a sample sizebased precision estimate, are more reliable alternatives. We conclude that funnel plots using the SMD in combination with the SE are unsuitable for publication bias assessments and can lead to falsepositive results.
https://doi.org/10.7554/eLife.24260.001Introduction
Systematic reviews are literature reviews intended to answer a particular research question by identifying, appraising and synthesizing all research evidence relevant to that question. They may include a metaanalysis, a statistical approach in which outcome data from individual studies are combined, which can be used to estimate the direction and magnitude of any underlying intervention effect, and to explore sources of betweenstudy heterogeneity. Simultaneously, metaanalysis can be used to assess the risk of publication bias: the phenomenon that published research is more likely to have positive or statistically significant results than unpublished experiments (Dwan et al., 2013). Metaanalyses are routinely used in clinical research to guide clinical practice and healthcare policy, reduce research waste and increase patient safety (Chalmers et al., 2014). The use of metaanalysis continues to increase (Bastian et al., 2010) and it has become more common to apply these approaches to the synthesis of preclinical evidence (Korevaar et al., 2011). Importantly, preclinical studies are, generally, individually small, with large numbers of studies included in metaanalysis, and large observed effects of interventions. This contrasts with clinical research, where metaanalyses usually involve a smaller number of individually larger experiments with smaller intervention effects.
This calls for methodological research to ascertain whether approaches to data analysis routinely used in the clinical domain are appropriate in the preclinical domain and for resources that guide and inform researchers, reviewers and readers on best practice. In this light, we present findings which show that the use of the standardized mean difference (SMD) measure of effect size in funnel plots can introduce a risk of incorrect assessment of publication bias, particularly in metaanalyses of preclinical data characterised by a large number of individually small studies with large observed effects.
Formulation of raw mean difference, standardized mean difference and normalized mean difference
To combine data statistically on e.g. the effects of an intervention which has been tested in several studies, outcome measures first need to be expressed on a common scale. Such scales include (for binary outcomes) the risk or odds ratios; and for continuous data a raw mean difference (RMD), SMD or normalized mean difference (NMD).
The RMD can be used when all outcome data are in the same measurement unit, and the interpretation of the outcome is the same in all settings (i.e. the reported measurement unit of the change in outcome has the same meaning in all studies). The RMD is calculated by subtracting the mean outcome value in the control group (M_{ctrl}) from the mean in the intervention group (M_{int}):
The observed standard deviation (SD) is likely to differ between experimental groups, and therefore the standard error (SE) of the RMD is calculated as:
where n is the sample size per group.
In cases where the measurement unit, or the interpretation of the outcome, or both differ between studies (e.g. a given change in infarct size measured in mm^{3} has a different consequence in the mouse brain than in the rat brain), the intervention effect may be expressed as an SMD. For each study the SMD is obtained by dividing the RMD by that study’s pooled standard deviation (SD_{pooled}) to create an effect estimate that is comparable across studies:
, where SD_{pooled} is:
Thus, the SMD expresses the intervention effect in all studies in the same new unit: the SD.
For each study, the standard error (SE) of the SMD can be approximated using the sample sizes (n) and the effect estimate (SMD):
Of note, Equations 3 and 5 estimate the SMD using the approach of Cohen (Cohen, 1988); this estimate is therefore termed Cohen’s d. However, Cohen’s d tends to overestimate the ‘true’ SMD and its variance when the sample sizes in the primary studies are small (e.g. <10). This bias can be corrected using the approach of Hedges (Hedges, 1981), which adjusts both the SMD estimate and its variance by a correction factor based on the total sample size. The resulting estimate is the unbiased SMD known as Hedges’ g (see Supplementary file 2 for full equations). In many clinical metaanalyses, Hedges’ g will be almost identical to Cohen’s d, but the difference between the estimates can be larger in preclinical metaanalyses, where small sample sizes are more common.
A third effect measure commonly used for continuous data in preclinical metaanalyses is the normalised mean difference (NMD), which relates the magnitude of effect in the intervention group to that seen in untreated animals, with reference to the outcome in a normal, healthy animal (Vesterinen et al., 2014). A condition for using the NMD is that the baseline measurement in an untreated, unlesioned ‘sham’ animal is known, or can be inferred. For each study, the NMD is calculated as:
where M_{sham} is the mean score for normal, unlesioned and untreated subjects. The corresponding SE is calculated as:
(see Supplementary file 2 for additional equations and (Vesterinen et al., 2014) for a comprehensive overview of (preclinical) metaanalysis methodology).
Note that Equation 5 dictates that the SE_{SMD} is correlated to the SMD effect size, whereas the SEs of the RMD (Equation 2) and NMD (Equation 7) are independent of the corresponding effect sizes.
Funnel plots and publication bias
Funnel plots are scatter plots of the effect sizes of the included studies versus a measure of their precision, usually the SE or 1/SE. In the absence of bias and heterogeneity, funnel plots should be funnelshaped and symmetrically centred around the summary effect estimate of the analysis, since 1) imprecise (smaller) studies will deviate further from the summary effect compared to precise (larger) studies and 2) studies are equally likely to overestimate or underestimate the true effect (Figure 1A). Assessment of the possible presence of publication bias frequently relies on a visual or analytical evaluation of funnel plot asymmetry. If studies showing small, neutral or controversial effects are more likely to remain unpublished, publication bias may occur. As a result, the funnel plot will become asymmetrical, and the summary effect estimate will shift accordingly (Figure 1B). Importantly, there are other causes of asymmetry in funnel plots. For instance, the true effect size in smaller (and therefore less precise) studies may be genuinely different from that in large studies (for instance because the intensity of the intervention was higher in small studies). For this reason, funnel plot asymmetry is often referred to as a method to detect small study effects, rather than being a definitive test for publication bias (Rothstein et al., 2005). In addition, artefacts and chance may cause asymmetry (as shown e.g. in this study).
Theoretical explanation of SMD funnel plot distortion
In a metaanalysis using the SMD as effect measure, in the absence of publication bias, observed SMDs in a funnel plot will be scattered around the true underlying SMD. However, the dependency of the SE_{SMD} on the observed SMD will impact the appearance of the funnel plot. When we review the equation for the SE_{SMD}, (Equation 5) the first component on the right of the ‘=' sign reflects the variance of the difference between the two group means, rescaled into pooled standard deviation units. Consequently, in this first part only n_{ctrl} and n_{int} play a role. The second component includes the squared SMD, and reflects the variation in the withingroups standard deviation as measured by SD_{pooled} (Equation 4).
If there is no intervention effect, the SMD (and the second component) will be zero, and the SE will therefore depend solely on the sample size (Equation 5 and Figure 2A). If an intervention effect is present, the SE will increase, as the size of SMD^{2} in the equation will increase. This is no problem if the observed SMD is similar to the true SMD. However, a study with an observed SMD larger than the true SMD will have a larger SE. On the other hand, a study with an observed SMD smaller than the true SMD (but >0) will have a relatively small SE (Figure 2B). This will cause funnel plot distortion: studies with a relatively small effect size (and associated SE) will skew towards the upper left region of the plot, while studies with a relatively large effect size and SE will skew towards the bottom right region of the plot, as the associated SE of these studies will be relatively large. Because the SMD is squared in the equation for the SE, this holds true for both positive and negative SMDs (Figure 2C). The smaller the first component of Equation 5, the larger the influence of the SMD on the size of the SE, worsening the distortion when sample sizes are small. Of note, this component is smallest when group sizes are unequal. The effect of the second component on the SE, and the resulting distortion, is largest if the sample size is small and the SMD is large (Figure 2D).
In summary, a funnel plot using both the SMD and its SE may become asymmetrical in the absence of publication bias. When funnel plot distortion is assessed by visual inspection, this skewing might cause the plot to be interpreted as being asymmetrical and lead the observer to erroneously conclude that publication bias is present. Furthermore, funnel plot asymmetry is often tested statistically using Egger’s regression (Egger et al., 1997) or Duval and Tweedie’s trim and fill analysis (Duval and Tweedie, 2000), but neither of these analyses take the phenomenon described above into account, and their use may lead to erroneous conclusions that publication bias is present.
Aim of this study
We investigated the reliability of RMD, SMD and NMDbased funnel plots for the assessment of publication bias in metaanalyses, using both empirical datasets and data simulations. We investigate the effect on the severity of funnel plot distortion of the study sample size, the number of studies in the metaanalysis and the magnitude of the intervention effect. We assess whether distortion can be avoided by using a precision estimate based on the sample size of the primary studies, as previously suggested for mean difference outcome measurements (Sterne et al., 2011). We then use this alternative approach to reanalyse published funnel plots, and show that these systematic reviews may have overestimated the severity of publication bias in their body of evidence. Our findings have important implications for the metaresearch field, since authors may have reached incorrect conclusions regarding the existence of publication bias based on funnel plots using the SMD measure of effect size.
Results
Publication bias assessment using RMD versus SMD funnel plots of two preclinical RMD datasets
Dataset 1 (ischaemic preconditioning) contains 785 individual effect sizes (Wever et al., 2015). In the original analysis using the RMD as effect measure, funnel plot asymmetry was detected by Egger’s regression (p=1.7×10^{−5}), but no additional studies were imputed in trim and fill analysis (Figure 3A). When expressing the same data as SMD, funnel plot asymmetry increased substantially (Figure 3B; p<1.0×10^{−15}, Egger regression) and 196 missing studies were imputed by trim and fill analysis, leading to adjustment of the estimated SMD effect size from 2.8 to 1.9.
Dataset 2 (stem cell treatments) contained 95 individual effect sizes (Zwetsloot et al., 2016). Funnel plot asymmetry was detected in the original analysis using RMD (p=0.02) and trim and fill analysis suggested a reduction in effect estimate of 0.1% after filling two additional studies (Figure 3C). In contrast, a funnel plot of the same data expressed as SMD showed asymmetry at a higher level of statistical significance (p=3.4×10^{−10}, Egger regression), but no missing studies were imputed (Figure 3D).
Data simulation results
Results of our first simulation (in the absence of publication bias) are shown in Table 1, and representative funnel plots of these simulations in Figure 4 (small study sample size) and Figure 4—figure supplement 1 (large study sample size). When we simulated no intervention effect, neither Egger’s regression nor trim and fill analysis gave different results for the RMD vs. SE and SMD vs. SE analyses (Table 1, Figure 4A,B,E and F and Figure 4—figure supplement 1, panel A, B, E and F) and in ~95% of cases there was no evidence of asymmetry. Most simulated funnel plots were assessed as symmetrical, however, as expected, around 5% of the cases were considered asymmetrical by chance.
When we simulated the presence of an intervention effect (Δμ = 10; RMD = 10 and SMD = 1 or Δμ = 5; RMD = 5 and SMD = 0.5), again around 5% of the RMD funnel plot analyses were judged asymmetrical (Table 1, Figure 4C and G, and Figure 4—figure supplement 1, panel C and G). In contrast, when using the SMD, funnel plot asymmetry was detected in over 60% of the simulated funnel plots with Δμ = 10, where the size of contributing studies was small (Figure 4D and H and Figure 4—figure supplement 1, panel D and H), increasing as the number of individual studies contributing to the metaanalysis increased. When we modelled larger individual contributing studies (n = 60–320 subjects), respectively 9%, 34% and 100% of the SMD funnel plots with 30, 300 or 3000 studies were assessed as asymmetrical (Table 1, Figure 4—figure supplement 1). Trim and fill analysis resulted in on average 7% extra studies filled in preclinical simulation scenarios using the RMD. Adjusting the overall effect estimate based on these filled data points improved the estimation of the simulated RMD in all scenarios. However, when using the SMD, the number of filled studies was much higher in many scenarios (up to 21% extra studies filled). As a result, the adjusted overall effect estimate after trim and fill in SMD funnel plots tended to be an underestimation of the true effect size. Finally, through visual inspection, distortion could be seen in all SMD funnel plots that incorporated a true effect, most prominent in the preclinical (small study) scenarios (Figure 4 and Figure 4—figure supplement 1).
When repeating the simulations using Cohen’s d SMD instead of Hedges’ g, or using Begg and Mazumdar’s test, we found highly similar results in all scenarios simulated (see Supplementary file 1 and exemplary funnel plots in Figure 4—figure supplement 2).
Next, we assessed the impact of censoring nonsignificant simulated experiments (to simulate publication bias) and the performance of SMD vs. 1/√n funnel plots and NMD funnel plots in the presence of an intervention effect as alternatives to the SMD vs. SE funnel plot. As in simulation 1, SMD vs. SE funnel plots of unbiased simulations were identified as asymmetrical by Egger’s test (Table 2). However, when the precision estimate was changed from SE to 1/√n, the prevalence of false positive results fell to the expected 5% (Table 2). For the NMD, Egger’s test performed correctly when using either the SE or 1/√n as precision estimate. In all scenario’s, approximately 50 out of 1000 simulated funnel plots appeared to be asymmetrical by chance (Table 2). The results of Egger’s test are supported by visual inspection of funnel plots of these unbiased scenario’s (Figure 5). The typical leftupward shift of the small SMD datapoints and rightdownward shift of the large SMD data points is clearly visible in the SMD vs. SE plot (Figure 5B), but not in the RMD, SMD vs. 1/√n or NMD plots.
In our final simulation we tested the performance of these different approaches in the presence of simulated publication bias. In the majority of these simulations of metaanalyses of individually small studies, asymmetry was detected both visually (Figure 6), and using Egger’s regression (Supplementary file 1). When the size of individual studies was small, SMD vs.1/√n funnel plots performed as well as the RMD vs. SE funnel plots, in both biased and unbiased simulations (Table 2). The NMD also behaved similar to the RMD with either an SE or 1/√n precision estimate.
Reanalyses of SMD funnel plots from published metaanalyses
Since a sample sizebased precision estimate might be more suitable for asymmetry analysis, we used data from five previously published metaanalyses which used an SMD vs. SE funnel plot and claimed funnel plot asymmetry as a result of publication bias. In the original publications, all five of these funnel plots were asymmetrical according to Egger’s regression test. In three out of five cases, this asymmetry was not present in funnel plots using 1/√n as a precision estimate (Table 3 and Figure 7). Furthermore, three out of five papers reported several missing data points, as detected by trim and fill analysis. Missing data points were not detected when using SMD vs. 1/√n funnel plots for trim and fill analysis (Table 3 and Figure 7).
Discussion
Using data from both simulated and empirical metaanalyses, we have shown that the use of Egger’s regression test for funnel plot asymmetry based on plotting SMD against SE is associated with such a substantial overestimation of asymmetry as to render this approach of little value, particularly when the size of contributing studies is small. This distortion occurs whenever an intervention effect is present, in metaanalyses both with and without publication bias. The severity of distortion and the risk of misinterpretation are influenced by the sample size of the individual studies, the number of studies in the metaanalysis, and the presence or absence of an intervention effect. Thus, the use of SMD vs. SE funnel plots may lead to invalid conclusions about the presence or absence of publication bias and should not be used. Since it is the association between the SMD and its SE that leads to funnel plot distortion, it almost inevitable that the issues described will occur with any test for publication bias that relies on an assessment of funnel plot asymmetry (e.g. Begg and Mazumdar’s test [Begg and Mazumdar, 1994]). When using trim and fill analysis, funnel plot distortion introduces the risk of incorrectly adjusting the summary effect estimate. Previous reports of the presence of publication bias based on this approach should be reevaluated, both for preclinical and clinical metaanalyses. Importantly, distortion does not occur in NMD vs. SE funnel plots, which formed the basis of a recent analysis showing evidence for substantial publication bias in the animal stroke literature (Sena et al., 2010).
As the use of metaanalysis to summarize clinical and preclinical data continues to increase, continuous evaluation and development of research methods is crucial to promote highquality metaresearch (Ioannidis et al., 2015). To our knowledge (see also Sterne et al., 2011), potential problems in tests for funnel plot asymmetry have not been extensively studied for SMDs, and guidance is limited. For instance, the Cochrane Handbook for Systematic Reviews of Interventions (Yan et al., 2015) states that artefacts may occur and that firm guidance on this matter is not yet available. It is disquieting that publication bias analyses using SMD funnel plots have been published in clinical and preclinical research areas, presumably because both the authors and the peer reviewers were unaware of the risk of spurious publication bias introduced by this methodology. Accepted papers from our group and others using SMDs for publication bias assessments have passed the peer review system, with no additional questions and or comments on this potential problem.
A similar phenomenon has been reported for the use of odds ratios in funnel plots, which also induces artificial significant results in Egger’s regression (Peters et al., 2006). Here, too, an alternative test based on sample size has been proposed to circumvent this problem (Peters et al., 2006), and we suggest to extend this recommendation to SMDs.
However, given the relative performance of the RMD, NMD and SMD approaches, it is reasonable to consider whether SMD should ever be used. The RMD approach is limited because there are many instances (for example across species) where, although the same units of measurement are used, a given change may have very different biological importance. The NMD approach is preferred, but – because it expresses the effects of an intervention as a proportion of lesion size – there may be circumstances where outcome in a nonlesioned animal is not reported or cannot be inferred, and here the NMD approach is not possible. Further, the relative performance of RMD, NMD and SMD approaches in identifying heterogeneity between groups of animal studies (partitioning of heterogeneity) or in metaregression is not known.
Taken with the increased distortion seen when contributing studies are individually small, this means our findings may be especially relevant for preclinical metaanalyses. The SMD is frequently used in preclinical metaanalyses to overcome expected heterogeneity between data obtained from different animal species. Nevertheless, the SMD is also used in clinical metaanalyses and the degree of distortion cannot be readily predicted. In any case, distortion causes the threshold for determining publication bias to be artificially lowered when using SMDs and their SE, increasing the chance of falsepositive results.
Of note, trim and fill analysis may not always be reliable when the number of studies in a metaanalysis is large; in half of the cases of our unbiased simulations with 300 and 3000 studies, many studies were deemed missing, even if no intervention effect was introduced. Still, the SMD simulations were always more susceptible to the addition of imputed studies if a true effect was introduced, and the effect size reduction was larger compared to RMD measurements.
Limitations of this study
We designed our data simulations to closely resemble empirical data in terms of the range of sample sizes, effect sizes and numbers of studies in a metaanalyses. We acknowledge that our current range of simulation scenarios does not enable us to predict the impact of funnel plot distortion in every possible scenario, but we present those scenarios which most clearly illustrate the causes and consequences of funnel plot distortion. Furthermore, our simulations may still be improved by e.g. studying the effects of unequal variances between treatment groups, sampling data from a nonnormal distribution, or introducing various degrees of heterogeneity into the simulation. However, research on how to optimally simulate these parameters is first needed, and was beyond the scope of this study. instead, we used reanalyses of empirical data to test our proposed solutions on a number of reallife metaanalyses which include all of the aforementioned aspects.
Recommendations
We recommend that, where possible, investigators use RMD or NMD instead of SMD when seeking evidence of publication bias in metaanalyses. Where it is necessary to use SMD, assessment for publication bias should use a sample sizebased precision estimate such as 1/√n. In a given analysis it may be possible to calculate an NMD effect size for some but not all studies. In these circumstances there is a tradeoff between the reduced number of included studies and an improved estimation of publication bias, and sensitivity analysis may be used to compare the metaanalysis outcome using the NMD versus the SMD. Of note, other methods to investigate publication bias in a dataset may be used in addition to funnel plots (e.g. failsafe N, Excess Significance Test [Ioannidis and Trikalinos, 2007], or selection method / weight funcion model approaches [Peters et al., 2006]), but the performance of these approaches in the context of SMD, RMD and NMD estimates of effect size is not known.
In conclusion, funnel plots based on SMDs and their SE should be interpreted with caution, as the chosen precision estimate is crucial for detection of real funnel plot asymmetry.
Materials and methods
We performed data simulations and reanalyses of empirical data using R statistical software (version 3.1.2; RRID:SCR_001905) and the most recent MBESS, xlsx, meta and metafor packages (Rothstein et al., 2005; Kelley, 2016; Schwarzer, 2016; Viechtbauer, 2010; Dragulescu, 2014) (See Supplementary file 3 for all R scripts). For all analyses involving RMD and SMD the primary outcome of interest was the number of asymmetrical funnel plots as detected by Egger's regression (Egger et al., 1997). As a secondary outcome, we assessed the number of missing studies as imputed by Duval and Tweedie’s trim and fill analysis (Duval and Tweedie, 2000). This method provides an estimate of the number of missing studies in a metaanalysis, and the effect that these missing studies may have had on its outcome. In brief, the funnel plot is mirrored around the axis represented by the overall effect estimate. Excess studies (often small, imprecise studies with a neutral or negative effect size) which have no counterpart on the opposite side of the plot are temporarily removed (trimmed). The trimmed plot is then used to reestimate the overall effect estimate. The trimmed data points are placed back into the plot, and then a paired study is imputed with the same precision but reflected to have an effect size reflected around the adjusted overall estimate, and plotted in a different color or symbol from the observed data points. The analysis is rerun and repeated until no further asymmetry is observed. We used trim and fill analysis and a random effects model in R to seek evidence for publication bias overstating the effectiveness of the interventions, based on the proposed direction of the intervention effect. Because of its superior performance in studies with small sample sizes, Hedges’ g was used in the main analyses throughout this manuscript. We considered a pvalue of <0.05 to be significant for Egger’s regression in individual simulations.
Empirical data published as RMD reanalyzed as SMD
Request a detailed protocolIn our first reanalysis of empirical data from published preclinical metaanalyses (Wever et al., 2015; Zwetsloot et al., 2016), we constructed funnel plots using the unbiased SMD (Hedges’ g [Hedges, 1981]) vs. SE, and compared these to funnel plots using the RMD vs. SE (as in the original publication).
Data simulation methods
Request a detailed protocolIn our first simulation, we tested the estimation of publication bias using the unbiased SMD (Hedges’ g) in simulated data where there was no publication bias. As a sensitivity analysis, all scenarios of simulation 1 were also performed using Cohen’s d. We generated simulated metaanalyses by simulating the desired number of individual studies, each with a control group and an intervention group. The control groups were simulated by randomly sampling individual subject data from a normal distribution with a mean (M_{ctrl}) of 30 and an SD of 10 (Table 4); these values were based on outcome data for functional imaging in myocardial infarction studies (Zwetsloot et al., 2016). Individual subject data for the intervention group was sampled from a normal distribution with mean M_{ctr} +ES (effect size). To assess the effect of differences in overall intervention effects on funnel plot distortion, we simulated metaanalyses for an ES of respectively 0, 5, or 10 (Table 4). To assess the effect of study sample size on funnel plot distortion, we simulated two types of study sizes: small (12–30 subjects per study), as is more common in animal studies, and large (60–320 subjects per study), as is more common in human studies. For each simulated study, we determined the number of subjects by sampling the group sizes from the uniform distribution within the ranges of study sizes given (Table 4). Of note, an intervention effect of SMD = 1 may appear large to those experienced in metaanalyses of clinical data, but is typical of those observed in animal studies, as are the group sizes reported (see e.g. Figure 2 and Table 3).
Simulation and aggregation of individual subject data into studylevel data was repeated until the desired number of studies to be included in the metaanalysis was obtained. We assessed the influence of the number of included studies on funnel plot distortion by simulating metaanalyses containing either 30, 300, or 3000 studies. Although there is no consensus on the minimal number of studies required for publication bias analysis, 30 has been previously proposed as the minimal number to obtain sufficient power for asymmetry testing (Lau et al., 2006). We chose 3000 studies for the largest metaanalysis as this is substantially larger than any metaanalysis of which we know, and any effects of study number are likely to be saturated at that number of studies. Importantly, we did not introduce publication bias to any of these datasets and the funnel plots should therefore be symmetrical. We repeated each simulation 1000 times, and we compared the effects of expressing the metaanalysis results as RMD or SMD, and used funnel plots with the effects size plotted on the xaxis and the SE as precision estimate plotted on the yaxis (RMD vs. SE and SMD vs. SE plots). As a second sensitivity analysis, we assessed the robustness of our findings using Egger’s test by retesting all scenario’s of simulation 1 using Begg and Mazumdar’s test (Begg and Mazumdar, 1994).
Informed by the outcomes of simulation 1, in our second simulation we selected the conditions introducing the most prominent distortion in SMD vs. SE funnel plots to investigate the performance of alternatives including SMD vs. 1/√n funnel plots and NMD funnel plots. Thus, all simulations were performed with a small study sample size, in the presence of an intervention effect (see Table 4) and with 3000 studies per metaanalysis. Under these conditions, we constructed RMD vs. SE and SMD vs. SE funnel plots as described above, as well as funnel plots of the SMD against the inversed square root of the total sample size (1/√n) in each study, and of the NMD against the SE. For the NMD, sham group data were simulated to have a mean of 70 and an SD of 4 (Table 4). Group size was selected to be 4–6 subjects, which is a typical sample size for sham groups in preclinical experiments. We performed the simulations once and compared outcomes across all four funnel plots.
In our final simulation we investigated the effects of a modelled publication bias on the performance of the SMD vs. SE and alternative approaches. We simulated metaanalyses containing 300 and 3000 studies with a small individual sample size and an intervention effect present (Δμ = difference in means between control and intervention group = 10; see Table 4). RMD vs. SE, RMD vs. 1/√n, SMD vs. SE, SMD vs. 1/√n and NMD vs. SE funnel plots were constructed and tested for asymmetry using Egger’s regression. We then introduced publication bias in these metaanalyses using a stepwise method, Publication bias was introduced stepwise, by removing 10% of primary studies in which the difference between the intervention and control group means was significant at p<0.05 (Studentt test), 50% of studies where the significance level was p≥0.05 to p<0.10, and 90% of studies where the significance level was p≥0.10. Funnel plot asymmetry testing was performed as above, and the results were compared to the unbiased simulations and between different funnel plot types. All simulations were repeated 1000 times. Of note, this simulation was not performed for metaanalyses of studies with a large sample size, since pilot data showed that the large sample size will cause only very few studies to be removed from the ‘biased’ metaanalysis.
Reanalysis of empirical data using an nbased precision estimate
Request a detailed protocolFinally, to assess the usefulness and impact of using a sample sizebased precision estimate in SMD funnel plots of empirical data, we reanalysed data from five published preclinical metaanalyses that used SMD vs. SE funnel plots to assess publication bias. The selected datasets were from our own groups, or from recent collaborations, which allowed for easy identification of metaanalyses using SMD vs. SE funnel plots, and easy access to the data. There were no selection criteria in terms of e.g. the number of studies in the analysis, or the outcome of the publication bias assessment. The distribution of the total number of subjects per data point in the selected studies is (in median (minmax): 11.7 (6–38) for Wever et al. (2012), 20(1246) for Groenink et al. (2015), 11(424) for Yan et al., 2015, 14.5 (6–35) for Kleikers et al. (2015) and 12(466) for Egan et al. (2016). For these data sets, we compared the outcome of Egger’s regression and trim and fill analysis when using SMD vs. SE funnel plots to that of SMD vs. 1/√n funnel plots. We obtained the corresponding author’s consent for reanalysis.
References

BookStatistical Power Analysis for the Behavioral Sciences (2nd ed)Hillsdale: Lawrence Erlbaum.

From a mouse: systematic analysis reveals limitations of experiments testing interventions in Alzheimer's disease mouse modelsEvidencebased Preclinical Medicine 3:e00015–00032.https://doi.org/10.1002/ebm2.15

Distribution Theory for Glass's Estimator of Effect Size and Related EstimatorsJournal of Educational Statistics 6:107.https://doi.org/10.2307/1164588

An exploratory test for an excess of significant findingsClinical Trials 4:245–253.https://doi.org/10.1177/1740774507079441

BookPublication Bias in MetaAnalysis  Prevention, Assessment and Adjustments (1st edn)John Wiley and sons Ltd.

Metaanalysis of data from animal studies: a practical guideJournal of Neuroscience Methods 221:92–102.https://doi.org/10.1016/j.jneumeth.2013.09.010

Conducting MetaAnalyses in R with the metafor PackageJournal of Statistical Software 36:1–48.https://doi.org/10.18637/jss.v036.i03
Article and author information
Author details
Funding
National Institute of Environmental Health Sciences (National Toxicology Program research funding)
 Kimberley E Wever
Netherlands Cardiovascular Research Initiative (CVONHUSTCARE)
 Steven AJ Chamuleau
National Centre for the Replacement, Refinement and Reduction of Animals in Research (Infrastructure Award)
 Emily S Sena
 Malcolm R MacLeod
Alexander Suerman Program (PhD student Scholarship)
 PeterPaul Zwetsloot
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Version history
 Received: December 14, 2016
 Accepted: August 21, 2017
 Accepted Manuscript published: September 8, 2017 (version 1)
 Version of Record published: September 29, 2017 (version 2)
Copyright
© 2017, Zwetsloot et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 5,863
 views

 430
 downloads

 128
 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.