For most people, effect size is a difficult concept to visualize. The most common form for reporting effect size is r (for correlation) or its equivalent, such as Phi (for Chi Square). This estimate of effect size runs from -1 to +1, corresponding to perfect negative and perfect positive relationships respectively. For example, if income and education level are perfectly correlated, r=1. You could perfectly predict someone’s income based on their level of education: the higher the level of education, the greater the income. If they are perfectly negatively correlated, r=-1: the higher the education level, the lower the income. If education level has no relationship to income, r=0. It’s the r values between -1 and 0 and between 0 and +1 that are hard to visualize.
The most common way to talk about effect size using r is to report r², or percent variance explained. In the example above, a correlation of .5 would indicate that 25% of the variance in income can be explained by education level. In the social sciences, many of the r values for significant results are in the .2 to .3 range, explaining only 4% to 9% of the variance. This doesn’t sound particularly “significant” or meaningful.
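To make the arithmetic concrete, here is a minimal Python sketch that squares r and reports it as a percentage. The function name percent_variance_explained is just an illustrative choice, not something from a particular statistics package.

```python
def percent_variance_explained(r: float) -> float:
    """Return r squared, expressed as a percentage of variance explained."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    # Rounding hides tiny floating-point noise such as 4.000000000000001.
    return round(100.0 * r * r, 2)


# The r values mentioned in the text.
for r in (0.2, 0.3, 0.5):
    pct = percent_variance_explained(r)
    print(f"r = {r:.1f} -> {pct:.0f}% of variance explained")
```

Because r is squared, a negative correlation of the same size explains the same share of variance.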
Rosenthal and Rubin’s Binomial Effect Size Display (BESD)
The most intuitive effect size display is a contingency table of percentages. This is how results of opinion polls are generally presented. For example, you might see a report that 60% of men vote Republican and 60% of women vote Democratic. In a world without Independents, the contingency table would look like this:
            Republican    Democratic
Men            60%           40%
Women          40%           60%
If you had surveyed 100 men and 100 women to get this result, you could test the significance of the finding by running a Chi Square on these figures, and you would find that the significance level is p<.005. Given these percentages, the larger your sample, the lower (better) the p value. Looking at the table of percentages, the difference between men and women looks pretty “significant.” But if you translate this into the standard effect size measure of r², or percent variance explained, you find that r=.2 and r²=.04, so gender has explained only 4% of the variance in party affiliation.
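The figures above can be checked with a short Python sketch. It assumes the uncorrected Pearson Chi Square (no continuity correction), which is the version that gives p<.005 for these counts. The function name chi_square_2x2 is hypothetical, and the p value uses the standard identity for one degree of freedom rather than a statistics library.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts.

    Returns (chi2, phi, p_value), where phi is the unsigned effect size.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi = math.sqrt(chi2 / n)
    # With 1 degree of freedom, the upper-tail probability of chi-square
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, phi, p_value


# Rows: men, women. Columns: Republican, Democratic.
counts = [[60, 40], [40, 60]]
chi2, phi, p = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, phi = {phi:.2f}, p = {p:.4f}")
```

Multiplying every count by ten (1,000 men and 1,000 women) raises the Chi Square from 8 to 80 and shrinks p dramatically, while phi stays at .2. The sample size changes the significance level, not the effect size.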
Working backwards from this, you can translate a correlation into a contingency table, or Binomial Effect Size Display. This effectively cuts both variables at the median point so that half of the observations fall into each column and half into each row. Take the example we talked about in an earlier blog post, the weight and height of men. We suggested that the correlation might be in the range of .5, corresponding to an r² of .25, or 25% of the variance. The corresponding BESD table for this would look like this, assuming equal numbers of tall and short and equal numbers of light and heavy:
            Light    Heavy
Short        75%      25%
Tall         25%      75%
So, if you guessed, without looking at him, that a man was heavier than average because he was taller than average, you would be right 75% of the time. That sounds a lot better than saying that height accounts for 25% of the variance. The formula for filling in the percentages in the BESD table is 50 ± 50r. So, if there is no relationship between height and weight, r=0 and all four cells of the BESD table = 50. If you guess heavier based on the man being taller, you would only be right half the time. If r=1, the cells would be 100 for light/short and heavy/tall and 0 for light/tall and heavy/short. The following table shows a range of r values and their corresponding “hit rates.”
Of course, it can be argued that this inflates the perceived effect. After all, you’d be right 50% of the time by flipping a coin. It’s also worth checking out some more recent articles that argue that the BESD is less accurate as r diverges from .5, but for most purposes, the BESD can serve to give you a more intuitive feel for the meaningfulness of your finding.
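The 50 ± 50r formula is easy to try out for yourself. The sketch below is a minimal Python illustration, not a reference implementation: the function names besd_rates and besd_table are made up for this example, and the row and column labels default to the height and weight example used here.

```python
def besd_rates(r: float) -> tuple:
    """Return (hit_rate, miss_rate) in percent using the 50 +/- 50r formula."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    hit = round(50 + 50 * r, 1)
    miss = round(50 - 50 * r, 1)
    return hit, miss


def besd_table(r: float, rows=("Tall", "Short"), cols=("Heavy", "Light")) -> dict:
    """Build the 2x2 BESD table as a nested dict of percentages."""
    hit, miss = besd_rates(r)
    return {
        rows[0]: {cols[0]: hit, cols[1]: miss},
        rows[1]: {cols[0]: miss, cols[1]: hit},
    }


# The height and weight example, assuming r = .5.
print(besd_table(0.5))

# Hit rates for a range of r values.
for r in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0):
    hit, miss = besd_rates(r)
    print(f"r = {r:.1f}: right {hit:.0f}% of the time, wrong {miss:.0f}%")
```

Even an r of .2, which explains only 4% of the variance, shows up here as a 60% versus 40% split, the same split as in the voting example.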