<- file 94varnot.html -> ANOVA when variances are unequal? (1994).
  • Gallagher. What do you do in ANOVA when variances are unequal?
  • =======================Phil Gallagher, 2 Nov 1994==========ssc
Message-ID: <STAT-L%94110220252648@VM1.MCGILL.CA>
From: "Philip Gallagher,(919)929-6010" <UPHILG@UNCMVS.OIT.UNC.EDU>
Subject: Re: variance - a different view

> > What can you do if you want to do an Analysis of Variance,
> > but not all variances are equal?

When I have a problem like this I get peace of mind by stepping back from the textbook questions such as " ... are the variances equal ..." and trying to think again about what I am really asking/wanting to find out.

We memorized in STAT 1 that an ANOVA tests the equality of the cell means, but that in order to have a valid test the cell variances must be (nearly) equal. It was years later that I realized that what that is saying (in the most fortunate happenstance) is that if you were to overlay the distributions (standardized to equal Ns) of the k cells in a single plot, and they all have (nearly) the same location and the same dispersion, the k distributions would lie one upon another. Or, put another way, that the distributions of the variable in the k cells are (nearly) equal. So ANOVA is really a test not just of equality of means, but of equality of distributions.

(My soul is absolutely convinced that some black-hearted genius could readily conjure up data with equal means and variances but quite different shapes - (j) skewed left and (k-j) skewed right, for example - but to my mind that is a pathological situation for ANOVA; one might be able to go away proclaiming equality of means, but I would hate to have people thinking that the k cells were alike/similar, wouldn't you?)

Anyway, now that I have the mental picture of k distributions piled up on each other, to me the question is no longer whether I am justified in doing an ANOVA, but, rather, are the k distributions (nearly) the same in both location and shape?
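The pathological case described in the parenthetical above is easy to construct. The following is a minimal sketch, not the poster's data: it assumes k = 2 cells, draws one right-skewed sample from an exponential distribution (an arbitrary choice, as are the seed and the sample size), and builds the second cell as its mirror image about the sample mean. The two cells then have the same mean and variance, up to rounding, and skewness of opposite sign.

```python
import random
import statistics


def skewness(xs):
    """Third central moment divided by the cubed population SD."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * sd ** 3)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability

# Cell 1: right-skewed (hypothetical data, exponential with rate 1).
right = [rng.expovariate(1.0) for _ in range(500)]

# Cell 2: reflect every value about the mean of cell 1.
# Reflection keeps the mean and the variance and flips the sign of the skewness.
center = statistics.fmean(right)
left = [2 * center - x for x in right]

for name, cell in (("right-skewed", right), ("left-skewed", left)):
    print(name,
          "mean=%.4f" % statistics.fmean(cell),
          "var=%.4f" % statistics.pvariance(cell),
          "skew=%.4f" % skewness(cell))
```

With equal cell means the between-cell sum of squares is essentially zero, so an ANOVA F test has nothing to detect here, even though an overlay of the two histograms would show them leaning in opposite directions.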
It only takes one of those k distributions to have a different location, dispersion, or shape for me to want to say that I find it hard to believe they were all drawn from the same parent distribution. So I'm driven to abandoning ANOVA as soon as I suspect that the distributions are not equal.

Typically, the first thing I do is overlay all the empirical distributions and perform the IOT test (IOT - Inter Ocular Trauma - it hits you right between the eyes). If it happens that at least one of the empirical distributions is grossly different from at least some of the others, then I am well along the path to being finished. Who cares about ANOVA? One (or more) of these cells is (are) different from the rest. Of course I may have a frightful time with p-values if I have to compute the D-statistic for every possible cell-pair, etc., but that's tough.

It's also why I say that a scientist-statistician is almost never in a hypothesis testing situation - once you know enough that you may legitimately design an experiment for an honest single a priori hypothesis test (the shapes of the distributions are equal, etc.), you know so much that few are willing to invest the resources just to get a single 100% defensible p-value.

And, to try to answer the original question: If you are convinced that at least one of the k cell variances is different from the rest, STOP! No need to do the ANOVA - you are already convinced that the cells were not drawn from the same population.

Now, if it were true that the ONLY thing you were interested in was whether the means differ, and you really do not care about the shapes of the distributions, well, that's a different game, and I'd go haul out my copy of Bradley's book on nonparametric statistics to find the location test that best suited my data. That is, if you can find one that really is strictly a test of location.
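The pairwise comparison mentioned above can be sketched as follows. The post does not say which "D-statistic" is meant; this sketch assumes the two-sample Kolmogorov-Smirnov D, the largest vertical gap between two empirical distribution functions. The cell names and values are hypothetical, and no p-values are computed: with k cells there are k(k-1)/2 pairs, which is the source of the "frightful time with p-values".

```python
from bisect import bisect_right
from itertools import combinations


def ks_d(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b.

    Both empirical CDFs are step functions that change only at observed
    values, so it is enough to compare them at every pooled data point.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)  # share of a that is <= x
        fb = bisect_right(b_sorted, x) / len(b_sorted)  # share of b that is <= x
        d = max(d, abs(fa - fb))
    return d


# Hypothetical cells: A and B nearly coincide, C sits far to the right.
cells = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.5, 2.5, 3.5, 4.5, 5.5],
    "C": [11.0, 12.0, 13.0, 14.0, 15.0],
}

# One D per unordered pair of cells: k * (k - 1) / 2 comparisons in all.
pairwise = {
    (p, q): ks_d(cells[p], cells[q])
    for p, q in combinations(sorted(cells), 2)
}

for (p, q), d in pairwise.items():
    print(p, "vs", q, "D = %.2f" % d)
```

A D near 1 means the two empirical distributions barely overlap, and a D near 0 means they nearly coincide. Any formal significance statement across all pairs would need a multiple-comparison adjustment, which is left out of this sketch.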
I'm always fearful that if I study a supposed test of location alone, I will find that there's still a component of shape/dispersion in it, and that if I haven't found that component, it's because I didn't work hard enough.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
  • Ulrich FAQ. http://www.pitt.edu/~wpilib/stats99.html