Skewness and outliers (1997)

This file has general overviews of skewness and outliers. Transformations and nonparametrics are related topics. Most of the notes were posted by me, but there are extensive citations of earlier notes.

  • Which power transformation?
  • Robustness of normal. REFs Moore
  • Power transformations.
  • What's a "heavy tail"?
  • Outliers and fat tails?
  • What is "extreme" skewness?
*----------------
  • Which power transformation?
  • =======================Rich Ulrich, 27 Jan 1997==========ssc
Subject: Re: Which transfomation?
Message-ID: <5cjco3$ldm@usenet.srv.cis.pitt.edu>

Mark Myatt (mark@myatt.demon.co.uk) wrote:
: Albert Craig <craig@uquebec.ca> writes:
: >Can anyone help me find a test or guide/book that will
: >enable me to choose the appropriate transformation in order
: >to be able to use parametric stats on my data. The major
: >problem is the heterogeneity of the variances (lack of
: >homoscedasticity)

: Here is a table from my book "Analysing Data" (Brixton Books,
: ISBN 1-873937-46-6, sorry for the plug!) but you'll find something
: similar in most applied stats books:

: Problem            Severity / Nature              Transformation
: -----------------  -----------------------------  --------------
: +ve skew           severe                         1/x
:                    moderate                       log(x)
:                    slight                         sqrt(x)

-- I would like to note that if this is considered strictly true, it must be a matter of DEFINITION or tautology -- that is, "severe skewness" is "skewness which is mended by the 1/x transformation."

I discovered when doing some simulations that having a coefficient of kurtosis of .7 (in my example) did not hurt my t-test if the underlying transformation needed was the square root, whereas it DID matter if the transformation needed was the logarithm.

How did I know what was NEEDED? Easy: I started with normal samples, and transformed AWAY from normal. If you take an N(3,1) and square it, you get something with much higher kurtosis than if you take N(10,1) and square it; or N(3,0.3) and square it. If you take N(0,1) and exponentiate, you get much higher kurtosis than if you start with N(0,0.1).

My usual approach is to look at the extremes, the quartiles and the median, and see which transformation makes them most symmetric.
*--------
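The "transform AWAY from normal" experiment in the post above can be sketched roughly as follows. This is only an illustration, not the original simulation: the sample size, the seed, the moment estimator of kurtosis, and the reading of the second parameter in N(mu, s) as a standard deviation are all assumptions made here.

```python
import math
import random


def excess_kurtosis(xs):
    """Moment estimate of excess kurtosis (about 0 for normal data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


def simulated_kurtosis(mu, sd, transform, n=50_000, seed=1997):
    """Draw n values from N(mu, sd), transform AWAY from normal, measure kurtosis.

    Assumption: sd is a standard deviation; n and seed are arbitrary choices.
    """
    rng = random.Random(seed)
    return excess_kurtosis([transform(rng.gauss(mu, sd)) for _ in range(n)])


def square(x):
    return x * x


results = {
    "N(3,1) squared": simulated_kurtosis(3.0, 1.0, square),
    "N(10,1) squared": simulated_kurtosis(10.0, 1.0, square),
    "N(3,0.3) squared": simulated_kurtosis(3.0, 0.3, square),
    "exp of N(0,1)": simulated_kurtosis(0.0, 1.0, math.exp),
    "exp of N(0,0.1)": simulated_kurtosis(0.0, 0.1, math.exp),
}

for label, k in results.items():
    print(f"{label:18s} excess kurtosis = {k:8.3f}")
```

The pattern to look for is the one described in the post: the further the starting normal sits from zero relative to its spread, the milder the distortion from squaring, and the smaller the spread, the milder the distortion from exponentiating.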
  • Robustness of normal. REFs
  • =======================David S Moore, 26 Mar 1997==========sse
From: dsm@b.stat.purdue.edu (David S. Moore)
Subject: Re: sensitivity of normal theory methods
Message-ID: <5hbscv$21rk@b.stat.purdue.edu>

A compact discussion of the robustness of normal theory methods for inference about _means_ and the lack of robustness of normal theory methods for inference about _spread_ appears (with references) in Chapter 7 of Moore and McCabe, Introduction to the Practice of Statistics.

Here is a partial list of simulation studies:

Pearson, E. S. and Please, N. W. (1975) Relation between the shape of population distribution and the robustness of four simple test statistics, Biometrika 62, 223-241.

Posten, H. O. (1978) The robustness of the two-sample t-test over the Pearson system, J. of Statistical Computation and Simulation 6, 295-311.

Posten, H. O. (1979) The robustness of the one-sample t-test over the Pearson system, J. of Statistical Computation and Simulation 9, 133-149.

Posten, H. O., Yeh, H. and Owen, D. B. (1982) Robustness of the two-sample t-test under violations of the homogeneity assumption, Communications in Statistics 11, 109-126.

Posten, H. O. (1982) Small sample power of the Wilcoxon test over the Pearson system, and comparison with the t-test, J. of Statistical Computation and Simulation 16, 1-18.

Posten, H. O. (1984) The Robustness of the t-test, Proceedings of the Conference on Statistical Robustness and Nonparametric Statistics, Schwerin, East Germany, May, 1983. VEB Deutscher Verlag der Wissenschaften, Berlin, pp. 92-99.

Posten, H. O. (1992) Robustness of the Two-Sample t-test Under Violations of the Assumption of Homogeneity Assumption, Part II. J. Comp. & Simul. 8, 2169-2184.

David Moore
*--------
  • Power transformations
  • =======================Rich Ulrich, 29 Jan 1997==========ssc
Subject: Re: Which transfomation?
Message-ID: <5cnp8a$aae@usenet.srv.cis.pitt.edu>

Mark Myatt (mark@myatt.demon.co.uk) wrote:
: Richard F Ulrich <wpilib+@pitt.edu> writes:
: > -- I would like to note that if this is considered strictly true, it
: >must be a matter of DEFINITION or tautology -- that is, "severe
: >skewness" is "skewness which is mended by the 1/x transformation."

: You are right. These are (only) rules of thumb but like most heuristics
: are useful. It's not quite that "tautological" as in the ordered
: response to skew "severe" means skewness NOT fixed by a log or square
: root transformation.

- I like that phrase, "the ordered response to skew". And it inspires further thought and comment:

- On the one hand, "all things being equal", we should use the least severe transformation that does the job (in practice, there is often no doubt, but this is an abstract point).

- On the other hand, if the reciprocal changes Frequency to Wavelength, is it really a "transformation" or is it a re-parameterization? And, the square root is milder than the log transform, but the latter is easier to talk about, and is frequently a "natural" candidate when biological or chemical processes are involved.

So, "all things" are not always equal.

: >My usual approach is to look at the extremes, the quartiles and the
: >median, and see which transformation makes them most symmetric.

: This is better than the application of blind rules.

: Best wishes,
*--------
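One possible way to mechanize the "look at the extremes, the quartiles and the median" check quoted above is sketched here. The scoring rule (sum of absolute log ratios), the function names, the candidate ladder, and the demo data are all assumptions made for illustration; the posts do not specify a formula, and in practice the numbers would be eyeballed rather than minimized blindly.

```python
import math
import random
import statistics


def asymmetry_score(xs):
    """Sum of |log ratio| of upper to lower distance from the median,
    taken at the quartiles and at the extremes. 0 means symmetric.
    Assumes no ties at the median (continuous data)."""
    xs = sorted(xs)
    q1, med, q3 = statistics.quantiles(xs, n=4)
    quartile_ratio = (q3 - med) / (med - q1)
    extreme_ratio = (xs[-1] - med) / (med - xs[0])
    return abs(math.log(quartile_ratio)) + abs(math.log(extreme_ratio))


# Candidate ladder, mildest first. The reciprocal is negated so that the
# ordering of the data is preserved.
LADDER = {
    "none": lambda x: x,
    "sqrt(x)": math.sqrt,
    "log(x)": math.log,
    "-1/x": lambda x: -1.0 / x,
}


def rank_transformations(data):
    """Return (name, score) pairs, most symmetric first. Data must be positive."""
    scores = {
        name: asymmetry_score([f(x) for x in data]) for name, f in LADDER.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical demo data: exponentiated normals, so the log is the
# transformation that is "needed" by construction.
rng = random.Random(1997)
demo = [math.exp(rng.gauss(2.0, 1.0)) for _ in range(2000)]
ranking = rank_transformations(demo)
for name, score in ranking:
    print(f"{name:8s} asymmetry = {score:.3f}")
```

Because the extremes are single observations, the extreme ratio is noisy in small samples; as the post says, when two rungs score about the same, other considerations (interpretability, a "natural" scale) can reasonably decide.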
  • What's a heavy tail?
  • =======================Rich Ulrich, 1 Jun 1997==========sse
Subject: Re: What's a heavy tail?
Message-ID: <5msl6c$ral@usenet.srv.cis.pitt.edu>

Ursula Kellett (kellett@pigeon.qut.edu.au) wrote:
: I have read some times that a heavy tail(s) is not so good when it comes
: to inference. But what is a 'heavy' tail?
: I have also read about 'long', 'thick', and 'high' tails!

: My presumption is that _any_ shape qualifies as being 'heavy' if it
: increases the alpha probability at a given critical value compared to the
: reference non-heavy tailed distribution.

-- As a practical matter, I would say that your criterion is a "sufficient" one, but it is not a "necessary" one. That is, if it increases the alpha, then that qualifies it as heavy. But if it decreases the alpha, THAT would also qualify it as heavy; for instance, the t-distribution tends to have short tails, when the data have long tails, for unequal Ns.

: If this sounds ok, then it follows that a more powerful test is performed
: than would otherwise have occurred???

-- If the effective alpha is increased, then the direction of error is to make the test "more powerful" in the sense that it will reject more often. (I prefer to use the term "more power" when doing comparisons between two tests that remain valid, where "not valid" is what we call a test whose actual size is larger than its purported size.)

-- Here is one way to get fat tails from Normal distributions. Consider the mixture of two Normals, where N1=90% or so has variance of 1, and N2=10% has a variance of 10; or 100. Whatever you get as the estimate of the variance of the mixture, the "VARIANCE of the variance" will be larger than you would expect for the total N. In effect, the sample acts as if the 10% WERE the total sample, when it comes to degrees of freedom.

In the same way, even one or two EXTREME outliers can ruin the variance properties of ANY sample, making it unfit for ANOVA or correlation, etc., no matter how large the total N.
*--------
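The "VARIANCE of the variance" remark above can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not the author's own calculation: it uses the variance-100 version of the mixture, N=100 per sample, 2000 replicates, and an "effective df" defined through the normal-theory relation Var(s^2) / E(s^2)^2 = 2 / df.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def draw_pure(rng, n):
    """n values from N(0, variance 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def draw_mixture(rng, n, p_wide=0.10, wide_sd=10.0):
    """About 90% from N(0, variance 1) and 10% from N(0, variance 100)."""
    return [
        rng.gauss(0.0, wide_sd if rng.random() < p_wide else 1.0) for _ in range(n)
    ]


def effective_df(draw, n=100, reps=2000, seed=1997):
    """Estimate df from the replicate-to-replicate spread of the variance estimate.

    For normal data Var(s^2) / E(s^2)^2 = 2 / df, so df = 2 * mean^2 / variance.
    Assumption: n, reps and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    s2 = [sample_variance(draw(rng, n)) for _ in range(reps)]
    return 2.0 * (sum(s2) / reps) ** 2 / sample_variance(s2)


df_pure = effective_df(draw_pure)
df_mixture = effective_df(draw_mixture)
print(f"pure normal,   N=100: effective df about {df_pure:.0f}")
print(f"90/10 mixture, N=100: effective df about {df_mixture:.0f}")
```

For the pure normal the estimate should land near the nominal N - 1; for the mixture it should come out far smaller, on the order of the size of the wide 10% component rather than the total N.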
  • Outliers and fat tails
  • =======================Rich Ulrich, 9 May 1997==========ssc,ssm
Subject: Re: Testing For Normal Distribution
Message-ID: <5l04f6$j86@usenet.srv.cis.pitt.edu>

Paul Velleman (pfv2@cornell.edu) wrote:
: In article <5ktgvs$a67@usenet.srv.cis.pitt.edu>, (Richard
: F Ulrich) wrote:
: > I do remember a consultee who liked his data as it was, because it was
: > very wonderfully normal, except for just 1 or 2 really, really,
: > extreme data values.
: >
: > So here is a clue: If your results change much when you drop an
: > extreme point or two, then you do NOT have a "moderate deviation
: > from normality." (This still means you want to define when "results"

: I also agree with Clay. But I don't agree with Richard's drift here. If you
: have normal data with a few outliers, separate the outliers and analyze the
: rest of the data accepting the normality assumption. Then look at the
: outliers in terms of the models you fit to the bulk of the data. Never let
: an occasional outlier dictate anything about the distribution.
: Distributions are about the mass of the data. Outliers are often evidence
: of inhomogeneity -- that is, they come from some other process and should
: be dealt with separately.

-- and I further agree with Paul - there are SEVERAL good ways to deal with outliers. I hope I can clarify my "drift", which drifted too broadly.

My point was the narrower one: there is one BAD way to deal with outliers, and that is to fold them into your parametric analysis. There are other kinds of non-normality, but the Outlier is a sneaky one which is sometimes overlooked. It takes experience to make the decision that Paul finds automatic - remove the extreme outlier. Or transform it, do SOMETHING with it.

I *have* had the experience where my advisee wanted to do his tests with one or two extremes still in there - the data, in his view, were "nearly normal" because there were only a couple of data values that anyone might object to. Since 99% of his points were OK, he figured the rest could be forgiven, I guess.... which is not quite how it works.

And Clay's advice accidentally, wrongly, implies that "large n" is a panacea - yes, large n is good, but *not* a cure-all. No matter how large the n, one huge data value CAN do things like double the Variance, or move the average by a ridiculous amount.
*--------
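The closing claim above, that one huge value can double the variance at any n, follows from the usual update formula for a sum of squares, and can be made concrete with a short sketch. The helper name, the sample sizes, and the N(50, sd 5) demo data are hypothetical choices for illustration only.

```python
import math
import random
import statistics


def doubling_outlier(xs):
    """Value that, appended to xs, doubles the sample variance.

    With n points, mean m and variance s^2 (n - 1 denominator), adding the
    point m + d gives new variance ((n - 1) s^2 + d^2 n / (n + 1)) / n,
    which equals 2 s^2 when d = s (n + 1) / sqrt(n), roughly s * sqrt(n).
    """
    n = len(xs)
    m = statistics.fmean(xs)
    s = statistics.stdev(xs)
    return m + s * (n + 1) / math.sqrt(n)


rng = random.Random(1997)
ratios = {}
for n in (100, 10_000):
    # Hypothetical "well-behaved" data: n draws from N(50, sd 5).
    sample = [rng.gauss(50.0, 5.0) for _ in range(n)]
    outlier = doubling_outlier(sample)
    before = statistics.variance(sample)
    after = statistics.variance(sample + [outlier])
    shift = statistics.fmean(sample + [outlier]) - statistics.fmean(sample)
    ratios[n] = after / before
    print(
        f"n={n:6d}: one value at {outlier:8.1f} gives variance ratio "
        f"{ratios[n]:.3f} and moves the mean by {shift:.3f}"
    )
```

The distance needed grows only like the square root of n, so a misplaced decimal point or a value from a different process can be far enough out even in a large sample; the same single value also moves the mean by about one standard error.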
  • What is "extreme" skew ...?
  • =======================Rich Ulrich, 07 Mar 1997==========spss
Subject: Re: a question about those small unnoticeable indices (skewness, kurtosis. Kaiser-Meyer-Olkin, etc.)
Message-ID: <5fpe3e$iab@usenet.srv.cis.pitt.edu>

Amir Hetsroni (amiron@michtec.macam98.ac.il) wrote:
: Can any of the pros around give me a good estimation what is the
: highest rate of kurtosis and skewness parametric tests and central
: tendency indices can live with intact more or less?

-- With variables that are dichotomies or just a few scale points, the measured kurtosis and skewness do not say much. I haven't worried about the measures of kurtosis; I do look for outliers, but they are usually on one end of the scale.

With continuous variables, samples of 75 or so, I have found that skewness less than .3 never worries me. For the same measured skewness, the effect on inference will be WORSE if taking logs would produce symmetry, rather than just taking square roots.

I wish I could say more, but I just note that you do ask your question precisely, concerning "parametric tests" and "central tendency indices." Having *one* extreme outlier can mess up parametric tests, and can move an average far from the mode or median -- which is what is often meant by "central tendency."

I would be interested in hearing of rules-of-thumb for when one MUST trim/transform/drop one (or more) outliers, for ANOVA-test purposes.

<< faq snip >>
* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
  • Ulrich FAQ. http://www.pitt.edu/~wpilib/stats99.html