Standardized Mean Differences

The calculation of Cohen’s d type effect sizes

Aaron R. Caldwell

2022-03-22

Standardized mean differences (SMDs) can be helpful in interpreting data and are essential for meta-analyses. In psychology, effect sizes are very often reported as an SMD rather than in raw units (though either is fine: see Caldwell and Vigotsky (2020)). In most papers the SMD is reported as Cohen’s d. In its simplest form, it is the mean difference (or the mean, in the case of a one-sample test) divided by the standard deviation.

\[ Cohen's \space d = \frac{Mean}{SD} \]

However, two major problems arise: bias and the calculation of the denominator. First, the Cohen’s d calculation is biased (meaning the effect is inflated), and a bias correction (often referred to as Hedges’ g) is applied to provide an unbiased estimate. Second, the choice of denominator can influence the estimate of the SMD, and there are multiple ways to calculate it. To make matters worse, the calculation (in most cases an approximation) of the confidence intervals involves the noncentral t distribution. This requires calculating a non-centrality parameter (lambda: \(\lambda\)), degrees of freedom (\(df\)), or even the standard error (sigma: \(\sigma\)) for the SMD. None of these is easy to determine, and these calculations are hotly debated in the statistics literature (Cousineau and Goulet-Pelletier 2021).

In this package we have opted to (mostly) use the approaches outlined by Goulet-Pelletier and Cousineau (2018). We found that these calculations were simple to implement and provided fairly accurate coverage for the confidence intervals for any type of SMD (independent, paired, or one sample). However, even the authors have outlined some issues with the method in a newer publication (Cousineau and Goulet-Pelletier 2021). It is my belief that SMDs provide another interesting description of the sample and have very limited inferential utility (though exceptions apply). You may disagree; if you are basing your inferences on the SMD and its associated confidence intervals, we recommend a bootstrapping approach (see boot_t_TOST) (Kirby and Gerlanc 2013).

In this section we detail the calculations involved in each SMD, along with the associated degrees of freedom, noncentrality parameter, and variance. If these SMDs are being reported in a scientific manuscript, we strongly recommend that the formulas for the SMDs you report be included in the methods section.

Bias Correction (Hedges)

For all of the SMD approaches below, the bias correction factor is calculated as the following:

\[ J = \frac{\Gamma(\frac{df}{2})}{\sqrt{\frac{df}{2}} \cdot \Gamma(\frac{df-1}{2})} \]

The correction factor is calculated in R as the following:

# correction factor J, computed on the log scale for numerical stability
J <- exp(lgamma(df/2) - log(sqrt(df/2)) - lgamma((df-1)/2))

This calculation was derived from the supplementary material of Cousineau and Goulet-Pelletier (2021).

Hedges’ g (bias-corrected Cohen’s d) can then be calculated by multiplying d by J:

\[ g = d \cdot J \]
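As a minimal sketch (assuming an uncorrected d and its degrees of freedom are already in hand), the correction can be applied directly. Note that hedges_g is a hypothetical helper used only for illustration, not a TOSTER function.

hedges_g <- function(d, df) {
  # correction factor J (same calculation as above)
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  d * J
}

# e.g., an uncorrected d of 0.50 with 48 degrees of freedom
hedges_g(d = 0.50, df = 48)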

Independent Samples

For independent samples there are two calculative approaches supported by TOSTER: in one, the denominator is the pooled standard deviation (Cohen’s d), and in the other it is the average standard deviation (Cohen’s d(av)). Currently, the choice is made by the function based on whether or not the variances can be assumed to be equal. If the variances are not assumed to be equal then Cohen’s d(av) is returned; if the variances are assumed to be equal then Cohen’s d is returned.

Variances Assumed Unequal: Cohen’s d(av)

For this calculation, the denominator is simply the square root of the average variance.

\[ s_{av} = \sqrt \frac {s_{1}^2 + s_{2}^2}{2} \]

The SMD, Cohen’s d(av), is then calculated as the following:

\[ d_{av} = \frac {\bar{x}_1 - \bar{x}_2} {s_{av}} \] Note: the x with the bar above it (pronounced “x-bar”) refers to the means of group 1 and group 2, respectively.

The degrees of freedom for Cohen’s d(av), derived from Delacre et al. (2021), is the following:

\[ df = \frac{(n_1-1)(n_2-1)(s_1^2+s_2^2)^2}{(n_2-1) \cdot s_1^4+(n_1-1) \cdot s_2^4} \]

The non-centrality parameter (\(\lambda\)) is calculated as the following:

\[ \lambda = d_{av} \times \sqrt{\frac{n_1 \cdot n_2(\sigma^2_1+\sigma^2_2)}{2 \cdot (n_2 \cdot \sigma^2_1+n_1 \cdot \sigma^2_2)}} \]

The standard error (\(\sigma\)) of Cohen’s d(av) is calculated as the following:

\[ \sigma = \sqrt{\frac{df}{df-2} \cdot \frac{2}{\tilde n} (1+d^2 \cdot \frac{\tilde n}{2}) -\frac{d^2}{J^2}} \] wherein \(J\) represents the Hedges correction (calculation above) and \(\tilde n\) is the harmonic mean of the two sample sizes (defined in the next subsection).
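Putting the pieces of this subsection together, a minimal sketch from summary statistics (means, SDs, and sample size per group) might look like the following; smd_dav is a hypothetical helper name, not a TOSTER function, and simply restates the formulas above.

smd_dav <- function(m1, m2, sd1, sd2, n1, n2) {
  s_av <- sqrt((sd1^2 + sd2^2) / 2)        # average-variance denominator
  d_av <- (m1 - m2) / s_av
  # Welch-type degrees of freedom (Delacre et al. 2021)
  df <- (n1 - 1) * (n2 - 1) * (sd1^2 + sd2^2)^2 /
    ((n2 - 1) * sd1^4 + (n1 - 1) * sd2^4)
  # noncentrality parameter
  lambda <- d_av * sqrt(n1 * n2 * (sd1^2 + sd2^2) /
                          (2 * (n2 * sd1^2 + n1 * sd2^2)))
  # standard error, using the harmonic mean of the sample sizes
  n_tilde <- 2 * n1 * n2 / (n1 + n2)
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  sigma <- sqrt(df / (df - 2) * 2 / n_tilde * (1 + d_av^2 * n_tilde / 2) -
                  d_av^2 / J^2)
  list(d_av = d_av, df = df, lambda = lambda, sigma = sigma)
}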

Variances Assumed Equal: Cohen’s d

For this calculation, the denominator is simply the pooled standard deviation.

\[s_{p} = \sqrt \frac {(n_{1} - 1)s_{1}^2 + (n_{2} - 1)s_{2}^2}{n_{1} + n_{2} - 2}\]

\[ d = \frac {\bar{x}_1 - \bar{x}_2} {s_{p}} \]

The degrees of freedom for Cohen’s d is the following:

\[ df = n_1 + n_2 - 2 \]

The non-centrality parameter (\(\lambda\)) is calculated as the following:

\[ \lambda = d \cdot \sqrt \frac{\tilde n}{2} \] wherein \(\tilde n\) is the harmonic mean of the two sample sizes, which is calculated as the following:

\[ \tilde n = \frac{2 \cdot n_1 \cdot n_2}{n_1 + n_2} \]

The standard error (\(\sigma\)) of Cohen’s d is calculated as the following:

\[ \sigma = \sqrt{\frac{df}{df-2} \cdot \frac{2}{\tilde n} (1+d^2 \cdot \frac{\tilde n}{2}) -\frac{d^2}{J^2}} \] wherein \(J\) represents the Hedges correction (calculation above).
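The same quantities under the equal-variance assumption can be sketched as follows; smd_d_pooled is again a hypothetical name used only for illustration.

smd_d_pooled <- function(m1, m2, sd1, sd2, n1, n2) {
  s_p <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  d <- (m1 - m2) / s_p
  df <- n1 + n2 - 2
  n_tilde <- 2 * n1 * n2 / (n1 + n2)       # harmonic mean of the sample sizes
  lambda <- d * sqrt(n_tilde / 2)
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  sigma <- sqrt(df / (df - 2) * 2 / n_tilde * (1 + d^2 * n_tilde / 2) -
                  d^2 / J^2)
  list(d = d, df = df, lambda = lambda, sigma = sigma)
}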

Paired Samples

For paired samples there are two calculative approaches supported by TOSTER: in one, the denominator is the standard deviation of the change scores (Cohen’s d(z)), and the other is the correlation-corrected effect size (Cohen’s d(rm)). Currently, the choice is made by the function based on whether or not the user sets rm_correction to TRUE. If rm_correction is set to TRUE then Cohen’s d(rm) is returned; otherwise Cohen’s d(z) is returned.

Cohen’s d(z): Change Scores

For this calculation, the denominator is the standard deviation of the difference scores which can be calculated from the standard deviations of the samples and the correlation between the paired samples.

\[ s_{diff} = \sqrt{sd_1^2 + sd_2^2 - 2 \cdot r_{12} \cdot sd_1 \cdot sd_2} \]

The SMD, Cohen’s d(z), is then calculated as the following:

\[ d_{z} = \frac {\bar{x}_1 - \bar{x}_2} {s_{diff}} \]

The degrees of freedom for Cohen’s d(z) is the following:

\[ df = 2 \cdot (N_{pairs}-1) \]

The non-centrality parameter (\(\lambda\)) is calculated as the following:

\[ \lambda = d_{z} \cdot \sqrt \frac{N_{pairs}}{2 \cdot (1-r_{12})} \]

The standard error (\(\sigma\)) of Cohen’s d(z) is calculated as the following:

\[ \sigma = \sqrt{\frac{df}{df-2} \cdot \frac{2 \cdot (1-r_{12})}{n} \cdot (1+d^2 \cdot \frac{n}{2 \cdot (1-r_{12})}) -\frac{d^2}{J^2}} \space \times \space \sqrt {2 \cdot (1-r_{12})} \]
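A minimal sketch of these calculations from paired summary statistics (with n pairs and correlation r12 between measures) is given below; smd_dz is a hypothetical helper name.

smd_dz <- function(m1, m2, sd1, sd2, n, r12) {
  s_diff <- sqrt(sd1^2 + sd2^2 - 2 * r12 * sd1 * sd2)
  d_z <- (m1 - m2) / s_diff
  df <- 2 * (n - 1)
  lambda <- d_z * sqrt(n / (2 * (1 - r12)))
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  sigma <- sqrt(df / (df - 2) * (2 * (1 - r12)) / n *
                  (1 + d_z^2 * n / (2 * (1 - r12))) - d_z^2 / J^2) *
    sqrt(2 * (1 - r12))
  list(d_z = d_z, df = df, lambda = lambda, sigma = sigma)
}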

Cohen’s d(rm): Correlation Corrected

For this calculation, the values from the calculations above are adjusted for the correlation between measures. As Goulet-Pelletier and Cousineau (2018) mention, this is useful when effect sizes are being compared across studies that involve between- and within-subjects designs.

First, the standard deviation of the difference scores is calculated:

\[ s_{diff} = \sqrt{sd_1^2 + sd_2^2 - 2 \cdot r_{12} \cdot sd_1 \cdot sd_2} \]

The SMD, Cohen’s d(rm), is then calculated with a small change to the denominator:

\[ d_{rm} = \frac {\bar{x}_1 - \bar{x}_2} {s_{diff} / \sqrt {2 \cdot (1-r_{12})} } \]

The degrees of freedom for Cohen’s d(rm) is the following:

\[ df = 2 \cdot (N_{pairs}-1) \]

The non-centrality parameter (\(\lambda\)) is calculated as the following:

\[ \lambda = d_{rm} \cdot \sqrt \frac{N_{pairs}}{2 \cdot (1-r_{12})} \]

The standard error (\(\sigma\)) of Cohen’s d(rm) is calculated as the following:

\[ \sigma = \sqrt{\frac{df}{df-2} \cdot \frac{2 \cdot (1-r_{12})}{n} \cdot (1+d^2 \cdot \frac{n}{2 \cdot (1-r_{12})}) -\frac{d^2}{J^2}} \]
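The corresponding sketch for the correlation-corrected effect size differs from the d(z) version only in the denominator and the final rescaling of the standard error; smd_drm is a hypothetical name, not a TOSTER function.

smd_drm <- function(m1, m2, sd1, sd2, n, r12) {
  s_diff <- sqrt(sd1^2 + sd2^2 - 2 * r12 * sd1 * sd2)
  d_rm <- (m1 - m2) / (s_diff / sqrt(2 * (1 - r12)))
  df <- 2 * (n - 1)
  lambda <- d_rm * sqrt(n / (2 * (1 - r12)))
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  sigma <- sqrt(df / (df - 2) * (2 * (1 - r12)) / n *
                  (1 + d_rm^2 * n / (2 * (1 - r12))) - d_rm^2 / J^2)
  list(d_rm = d_rm, df = df, lambda = lambda, sigma = sigma)
}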

One Sample

For a one-sample situation, the calculations are very straightforward. The denominator is simply the standard deviation of the sample.

\[s={\sqrt {{\frac {1}{N-1}}\sum _{i=1}^{N}\left(x_{i}-{\bar {x}}\right)^{2}}}\] The SMD is then the mean of X divided by the standard deviation.

\[ d = \frac {\bar{x}} {s} \]

The degrees of freedom for Cohen’s d is the following:

\[ df = N - 1 \]

The non-centrality parameter (\(\lambda\)) is calculated as the following:

\[ \lambda = d \cdot \sqrt N \]

The standard error (\(\sigma\)) of Cohen’s d is calculated as the following:

\[ \sigma = \sqrt{\frac{df}{df-2} \cdot \frac{1}{N} (1+d^2 \cdot N) -\frac{d^2}{J^2}} \] wherein \(J\) represents the Hedges correction (calculation above).
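For raw one-sample data, the full set of quantities can be sketched in a few lines; smd_one is a hypothetical helper name used only for illustration.

smd_one <- function(x) {
  N <- length(x)
  d <- mean(x) / sd(x)
  df <- N - 1
  lambda <- d * sqrt(N)
  J <- exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))
  sigma <- sqrt(df / (df - 2) * (1 / N) * (1 + d^2 * N) - d^2 / J^2)
  list(d = d, df = df, lambda = lambda, sigma = sigma)
}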

Confidence Intervals

For the SMDs calculated in this package we use the non-central t method outlined by Goulet-Pelletier and Cousineau (2018). These calculations are only approximations and newer formulations may provide better coverage (Cousineau and Goulet-Pelletier 2021). In any case, if the calculation of confidence intervals for SMDs is of the utmost importance then I would strongly recommend using bootstrapping techniques rather than any calculative approach whenever possible (Kirby and Gerlanc 2013).

The calculation of the confidence intervals in this package involves a two-step process: 1) using the noncentral t-distribution to calculate the lower and upper bounds of \(\lambda\), and 2) transforming these back to the effect size estimate.

Calculate confidence intervals around \(\lambda\).

\[ t_L = t_{(1/2-(1-\alpha)/2,\space df, \space \lambda)} \\ t_U = t_{(1/2+(1-\alpha)/2,\space df, \space \lambda)} \]

Then transform back to the SMD.

\[ d_L = \frac{t_L}{\lambda} \cdot d \\ d_U = \frac{t_U}{\lambda} \cdot d \]
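As a minimal sketch of this two-step process, the noncentral t quantiles can be obtained with qt() and then rescaled back to the SMD; smd_ci is a hypothetical helper name, and the example values mirror the plotting example below.

smd_ci <- function(d, df, lambda, alpha = 0.05) {
  # step 1: quantiles of the noncentral t at alpha/2 and 1 - alpha/2
  t_L <- qt(alpha / 2, df = df, ncp = lambda)
  t_U <- qt(1 - alpha / 2, df = df, ncp = lambda)
  # step 2: transform back to the effect size scale
  c(lower = t_L / lambda * d, upper = t_U / lambda * d)
}

smd_ci(d = 0.43, df = 58, lambda = 1.66)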

Plotting SMDs

Sometimes you may take a different approach to calculating the SMD, or you may only have the summary statistics from another study. For this reason, I have included a way to plot the SMD based on just three values: the estimate of the SMD, the degrees of freedom, and the non-centrality parameter. So long as all three are reported, or can be estimated, then a plot of the SMD can be produced.

Two types of plots can be produced: consonance (type = "c") and consonance density (type = "cd"); both are displayed by default (type = c("c","cd")).

plot_smd(d = .43,
         df = 58,
         lambda = 1.66,
         smd_label = "Cohen's d"
         )

References

Caldwell, Aaron, and Andrew D. Vigotsky. 2020. “A Case Against Default Effect Sizes in Sport and Exercise Science.” PeerJ 8 (November): e10314. https://doi.org/10.7717/peerj.10314.
Cousineau, Denis, and Jean-Christophe Goulet-Pelletier. 2021. “A Study of Confidence Intervals for Cohen’s Dp in Within-Subject Designs with New Proposals.” The Quantitative Methods for Psychology 17 (1): 51–75. https://doi.org/10.20982/tqmp.17.1.p051.
Delacre, Marie, Daniel Lakens, Christophe Ley, Limin Liu, and Christophe Leys. 2021. “Why Hedges’ gs Based on the Non-Pooled Standard Deviation Should Be Reported with Welch’s t-Test,” May. https://doi.org/10.31234/osf.io/tu6mp.
Goulet-Pelletier, Jean-Christophe, and Denis Cousineau. 2018. “A Review of Effect Sizes and Their Confidence Intervals, Part I: The Cohen’s d Family.” The Quantitative Methods for Psychology 14 (4): 242–65. https://doi.org/10.20982/tqmp.14.4.p242.
Kirby, Kris N., and Daniel Gerlanc. 2013. “BootES: An R Package for Bootstrap Confidence Intervals on Effect Sizes.” Behavior Research Methods 45 (4): 905–27. https://doi.org/10.3758/s13428-013-0330-5.