ANOVA when variances are unequal? (1994).
Gallagher. What do you do in ANOVA
when variances are unequal?
=======================Phil Gallagher, 2 Nov 1994==========ssc
Message-ID:
From: "Philip Gallagher,(919)929-6010"
Subject: Re: variance - a different view
> >
> > What can you do if you want to do an Analysis of Variance,
> > but not all variances are equal?
When I have a problem like this I get peace of mind by
stepping back from the textbook questions such as "... are
the variances equal ..." and trying to think again about what
I am really asking/wanting to find out. We memorized in
STAT 1 that an ANOVA tests the equality of the cell means
but in order to have a valid test the cell variances must
be (nearly) equal. It was years later that I realized what
that is saying (in the most fortunate happenstance): if you
were to overlay the distributions (standardized to equal Ns)
of the k cells in a single plot, and they all have (nearly)
the same location and the same dispersion, the k distributions
will lie one upon another. In other words, the distributions
of the variable in each of the k cells are (nearly) equal.
So ANOVA is really a test not just of equality of means,
but of equality of distributions. (My soul is absolutely
convinced that some black-hearted genius could readily conjure
up data with equal means and variances but quite different
shapes, say j cells skewed left and k-j skewed right, but to
my mind that is a pathological situation for ANOVA; one might
be able to go away proclaiming equality of means, but I would
hate to have people thinking that the k cells were
alike/similar, wouldn't you?)
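
To make the "black-hearted genius" case concrete, here is a minimal
sketch (an illustration, not part of the original post). It assumes
two cells built by hand: one right-skewed, and its mirror image about
its own mean, so means and variances agree by construction. The helper
names (anova_f, ks_d, skewness) are hypothetical, and the two-sample
Kolmogorov-Smirnov distance is used only as one convenient measure of
how far apart two empirical distributions are.

```python
import math
from bisect import bisect_right

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def skewness(xs):
    # Moment-based skewness: third central moment over sd cubed.
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

def anova_f(groups):
    # One-way ANOVA F statistic: between-cell over within-cell mean square.
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def ks_d(a, b):
    # Largest vertical gap between the two empirical CDFs.
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in sa + sb
    )

# Hypothetical data: exponential quantiles (skewed right) and their
# mirror image about the sample mean (skewed left).
n = 50
right = [-math.log(1 - (i + 0.5) / n) for i in range(n)]
center = mean(right)
left = [2 * center - x for x in right]

f_stat = anova_f([right, left])
d_stat = ks_d(right, left)
print("means    :", mean(right), mean(left))
print("variances:", variance(right), variance(left))
print("skewness :", skewness(right), skewness(left))
print("ANOVA F  :", f_stat)
print("KS D     :", d_stat)
```

The F statistic is essentially zero because the cell means coincide,
while the skewness values have opposite signs and D is well away from
zero: the means agree, yet the cells are not alike.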
Anyway, now that I have the mental picture of k distributions
piled up on each other, to me the question is no longer whether
I am justified in doing an ANOVA, but, rather, are the k
distributions (nearly) the same in both location and shape?
It only takes one of those k distributions to have a different
location, dispersion, or shape for me to want to say that I find
it hard to believe they were all drawn from the same parent
distribution. So I'm driven to abandoning ANOVA as soon as I
suspect that the distributions are not equal. Typically, the first
thing I do is overlay all the empirical distributions and
perform the IOT test (IOT = Inter-Ocular Trauma: it hits you
right between the eyes). If it happens that at least one of
the empirical distributions is grossly different from at least
some of the others, then I am well along the path to being
finished. Who cares about ANOVA? One (or more) of these cells
is (are) different from the rest. Of course I may have a frightful
time with p-values if I have to compute the D-statistic for
every possible cell-pair, etc., but that's tough. It's also
why I say that a scientist-statistician is almost never in a
hypothesis testing situation - once you know enough so that
you may legitimately design an experiment for an honest
single a priori hypothesis test - the shapes of the distributions
are equal, etc. - you know so much that few are willing to invest
the resources just to get a single 100% defensible p-value.
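
The "D-statistic for every possible cell-pair" step might look like the
sketch below. Several things here are assumptions and not from the
post: reading D as the two-sample Kolmogorov-Smirnov distance, getting
p-values by permutation, and applying a Bonferroni adjustment across
pairs. The cells A, B, C are hypothetical (evenly spaced normal
quantiles, with C given four times the standard deviation), and the
function names are made up for the illustration.

```python
import random
from bisect import bisect_right
from itertools import combinations
from statistics import NormalDist

def ks_d(a, b):
    # Largest vertical gap between the two empirical CDFs.
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in sa + sb
    )

def perm_p(a, b, n_perm, rng):
    # Permutation p-value: how often does a random relabelling of the
    # pooled values give a D at least as large as the observed one?
    observed = ks_d(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_d(pooled[:len(a)], pooled[len(a):]) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

def pairwise_d(cells, n_perm=300, seed=0):
    # Returns {(cell_i, cell_j): (D, raw p, Bonferroni-adjusted p)}.
    rng = random.Random(seed)
    pairs = list(combinations(sorted(cells), 2))
    out = {}
    for i, j in pairs:
        d, p = perm_p(cells[i], cells[j], n_perm, rng)
        out[(i, j)] = (d, p, min(1.0, p * len(pairs)))
    return out

def grid(mu, sigma, n):
    # Deterministic "ideal" sample: evenly spaced quantiles of N(mu, sigma).
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

# Hypothetical cells: A and B nearly identical, C far more dispersed.
cells = {"A": grid(0.0, 1.0, 100), "B": grid(0.1, 1.0, 100), "C": grid(0.0, 4.0, 100)}
results = pairwise_d(cells)
for pair, (d, p, p_adj) in results.items():
    print(pair, "D =", round(d, 3), "p =", round(p, 4), "adjusted p =", round(p_adj, 4))
```

With k cells there are k(k-1)/2 pairs, so the adjustment gets harsher
quickly; that is one face of the "frightful time with p-values".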
And, to try to answer the original question: If you are convinced
that at least one of the k cell variances is different from the
rest, STOP! No need to do the ANOVA - you are already convinced
that the cells were not drawn from the same population.
Now, if it were true that the ONLY thing you were interested
in was whether the means differ and you really do not care
about the shapes of the distributions, well, that's a different
game, and I'd go haul out my copy of Bradley's book on
nonparametric statistics to find the location test that best
suited my data, if one can be found that really is strictly
a test of location. I'm always fearful that if I study a
supposed test of location alone, that I will find that there's
still a component of shape/dispersion in it, and that if I
haven't found that component, it's because I didn't work
hard enough.
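
That worry can be illustrated with a small sketch (not part of the
original post). The choice of the Mann-Whitney rank-sum statistic is an
assumption made for the example, as are the two hypothetical cells: a
skewed one and a tight symmetric one, constructed to have the same
mean. The rank-sum test compares P(X < Y) with 1/2, so it reacts to
the difference in shape even though the means agree.

```python
import math
from statistics import NormalDist

def mean(xs):
    return sum(xs) / len(xs)

def mann_whitney(x, y):
    # U counts pairs with x < y (ties count one half).
    # Returns the estimate of P(X < Y) and the large-sample z score
    # (no tie correction; the data below have no ties).
    n, m = len(x), len(y)
    u = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)
    p_hat = u / (n * m)
    z = (u - n * m / 2) / math.sqrt(n * m * (n + m + 1) / 12)
    return p_hat, z

# Hypothetical cells with equal means but different shapes:
# x is right-skewed (exponential quantiles), y is a narrow normal
# centred exactly on the sample mean of x.
n = 100
x = [-math.log(1 - (i + 0.5) / n) for i in range(n)]
y_dist = NormalDist(mean(x), 0.1)
y = [y_dist.inv_cdf((i + 0.5) / n) for i in range(n)]

p_hat, z = mann_whitney(x, y)
print("mean of x:", mean(x), "mean of y:", mean(y))
print("estimated P(X < Y):", p_hat)
print("rank-sum z score  :", z)
```

The means match, yet the estimate of P(X < Y) sits well above 1/2 and
the z score is large: what looks like a test of location is picking up
shape.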
* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
http://www.pitt.edu/~wpilib/stats99.html