Here is one posting with references, and a response to it with further commentary and references. The subject is "compositional analysis", i.e., what to do when the categories add to 100% and information about the ratio of A to B, for instance, may be more important than the absolute levels of A or B.

=====================Steve Cumming, 09 Oct 1995========ssc
From: stevec@geog.ubc.ca (Steve Cumming)
Subject: Re: Testing for differences in proportions
Message-ID: <45ca3n$rrq@nntp.ucs.ubc.ca>

>> ...snip
>I have seen a lot of individuals who have posted help for Maria on this
>problem. I am sure that they have intended to help her with her problem.

And goes on to recommend Aitchison's book, The Statistical Analysis of Compositional Data, singing the good professor's praises. To which I add, hallelujah and amen. Not least among the book's virtues is that it succeeds in explaining non-trivial multivariate methods so that I almost feel I understand them.

I must take issue with Brian on one point though. Compositional data are not rare, at least in landscape ecology (my discipline), other areas of ecological enquiry, geochemistry, food science (evidently) and gawd knows what else.

I'm appending a fairly complete bibliography of Aitchison's work, and a number of recent forward references from the ecological literature:

@string{jrssb = "J. Roy. Stat. Soc. Ser. B"}   % Math Library
@string{mathgeol = "Math. Geol."}              % Mathematical Biology, Main Stacks
@string{japecol = "J.\ Appl.\ Ecol."}          % Journal of Applied Ecology, Woodward

@Article{aitchison82,
  author  = "J. Aitchison",
  title   = "The statistical analysis of compositional data",
  journal = jrssb,
  year    = 1982,
  volume  = 44,
  number  = 2,
  pages   = "139-177",
  annote  = "With discussion."
}

@Article{aitchison84a,
  author  = "J. Aitchison",
  title   = "The statistical analysis of geochemical compositions",
  journal = mathgeol,
  year    = 1984,
  volume  = 16,
  number  = 6,
  pages   = "531-564"
}

@Article{aitchison84b,
  author  = "J. Aitchison and S. M. Shen",
  title   = "Measurement Error in Compositional Data",
  journal = mathgeol,
  year    = 1984,
  volume  = 16,
  number  = 6,
  pages   = "637-650"
}

@Article{aitchison84c,
  author  = "J. Aitchison",
  title   = "Reducing the Dimensionality of Compositional Data Sets",
  journal = mathgeol,
  year    = 1984,
  volume  = 16,
  number  = 6,
  pages   = "617-635"
}

@Book{aitchison86,
  author    = "J. Aitchison",
  title     = "The Statistical Analysis of Compositional Data",
  publisher = "Chapman and Hall",
  year      = 1986,
  series    = "Monographs on Statistics and Applied Probability",
  address   = "London",
  annote    = "A greatly expanded version of the original 1982 paper, with lots of examples of hypothesis testing"
}

@Article{aitchison92,
  author  = "J. Aitchison",
  title   = "On Criteria for Measures of Compositional Difference",
  journal = mathgeol,
  year    = 1992,
  volume  = 24,
  number  = 4,
  pages   = "365-379"
}

% Some selected applications from the recent literature
% follow

@Article{aebischer93,
  author  = "N. J. Aebischer and P. A. Robertson and R. E. Kenward",
  title   = "Compositional Analysis of habitat use from animal radio-tracking data",
  journal = "Ecology",
  year    = 1993,
  volume  = 74,
  number  = 5,
  pages   = "1313-1325",
  annote  = "I'm sending this to Kim"
}

@Article{robertson93,
  author  = "P. A. Robertson and M. I. A. Woodburn and W. Neutel and C. E. Bealey",
  title   = "Effects of land use on breeding pheasant density",
  OPTcrossref = "",
  OPTkey  = "",
  journal = japecol,
  year    = "1993",
  volume  = "30",
  pages   = "465-477"
}

@Article{clements91,
  author  = "A.-M. Clements and M. C. Jones",
  title   = "An ecological example of the application of projection pursuit to compositional data",
  journal = "Vegetatio",
  year    = "1991",
  volume  = "95",
  pages   = "101-107",
  annote  = "An interesting but unsuccessful attempt to relate vegetation and soils patterns in New South Wales using the latest neato-keen methods"
}

@Article{hermy91,
  author  = "M. Hermy and P. J. Lewi",
  title   = "Multivariate ratio analysis, a graphical method for ecological ordination",
  journal = "Ecology",
  year    = 1991,
  volume  = 72,
  number  = 2,
  pages   = "735-738"
}

@Article{rayens91,
  author  = "W. S. Rayens and C. Srinivasan",
  title   = "Box-{C}ox transformations in the analysis of compositional data",
  journal = "J. Chemometrics",
  year    = 1991,
  volume  = 5,
  pages   = "227-239",
  annote  = "Generalise Aitchison's transform to improve multi-variate normality in some cases. Also discuss MLE methods for estimation of confidence for the 'true, unknown compositional constituents'."
}

%% Science Citation Index (That I haven't looked up yet)
%
% Quaternary Research  41  70  1994
% Oecologia  86  147  1991
% Behavioural Ecology  26  139  1990
% J Roy Stat Soc A  157  231  1994
% Biometrika  79  57  1983   QH 301 B5
% Ibis  136  39  1994   Woodward: treats method as routine

%% Silver Platter: search on Compositional near1 data
%
% Can. J. Plant Science 74(3)   Mac S1 C35
% Silvae Genetica 42(6)
% Silvae Genetica 39 p 173
% Anatomical Record 240(4) 625-31   Woodward QL 801 A45

%% Further references abound in the Geological, medical, horticultural
%% and food science (ick!) literature
________________________________________________________________

...more on Compositional analysis. Watson.

=====================Dave Watson, 12 Oct 1995========ssm,sgg
From: watson@madvax (Dave Watson)
Newsgroups: sci.geo.geology,sci.stat.math
Subject: Re: (Wrong) Statistical Analysis of Compositional Data
Message-ID: <45hre5$ck5@styx.uwa.edu.au>

Steve Cumming (stevec@geog.ubc.ca) wrote:
: Actually, as far as I can gather, any good stats package should
: be able to do what you want, for example SAS. The only trick is
: doing the log-ratio transforms. I've written a C-language utility to
: do this, and am working on adding multivariate normality tests.
: You can have a copy, if you wish. So can anyone else, by
: writing me.

This is a misunderstanding.
Compositional data, because it is composed of mutually dependent components, should be thought of only as "directional" data. The chemical, or mineral, composition of a rock is a set of proportions - although we measure the magnitudes of the components, it is a grave mistake to consider the magnitude of a component, or the sum of the magnitudes of the components, as being a significant characteristic of a composition. Only the relative magnitudes, that is, the proportions, of the components are relevant. Those proportions define a direction or, if you like, a vector with an undefined length.

This means that the difference between two compositions is an angle. And that in turn means that treating compositional data on the sum-to-one plane, or a ternary diagram, introduces significant errors.

Anyone who cares to can see these errors for themselves with a sketch on the back of an envelope. Consider two pairs of compositions, where each pair has the same small angular difference but one pair is near the center of the ternary diagram while the other pair is near a vertex. Your sketch will show you that the distances between the mates of each pair are not the same. This contradicts the precondition that each pair has the same angular difference.

The reason is simple. An angle is a distance in spherical space. Distance in linear space, or any other space including log-ratio space, is not the same as distance in spherical space. A fixed distance in spherical space will not be invariant when projected onto another space.

When a set of compositions is analysed on the sum-to-one plane, any intention to find clusters, classifications, density contours, or discriminant criteria of any sort will be confounded by the errors introduced by the initial projection onto that plane. No amount of correctional transformation, including log-ratio, will properly compensate for that introduced distortion. If you want the right answer, you must treat your compositional data in spherical space.
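
The back-of-an-envelope sketch described above can also be tried numerically. What follows is only a minimal sketch in Python under stated assumptions, not code from either posting: the two example pairs, the step size DELTA, and all function names are made up for illustration, and the centred log-ratio is assumed as one common form of log-ratio transform (the postings do not say which form is meant). It builds two pairs of three-part compositions separated by the same angle, one pair at the center of the ternary diagram and one near a vertex, and prints three measures of difference for each pair.

```python
import math

def unit(v):
    """Scale v to unit length (a point on the sphere)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def close(v):
    """Scale v so its parts sum to one (a point on the sum-to-one plane)."""
    s = sum(v)
    return [x / s for x in v]

def rotate(u, w, delta):
    """Unit vector at angle delta from unit vector u, moving toward unit
    vector w. Assumes w is orthogonal to u."""
    return [math.cos(delta) * a + math.sin(delta) * b for a, b in zip(u, w)]

def angle(a, b):
    """Angle in radians between the directions of a and b."""
    a, b = unit(a), unit(b)
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))

def planar_distance(a, b):
    """Euclidean distance between a and b after closure to sum one."""
    a, b = close(a), close(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def clr(v):
    """Centred log-ratio transform (assumed form of the log-ratio)."""
    logs = [math.log(x) for x in close(v)]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def logratio_distance(a, b):
    """Euclidean distance between the clr-transformed compositions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr(a), clr(b))))

DELTA = 0.05  # hypothetical angular difference, in radians

# Pair at the center of the ternary diagram.
center_a = unit([1.0, 1.0, 1.0])
center_b = rotate(center_a, unit([1.0, -1.0, 0.0]), DELTA)

# Pair near the first vertex; (0, 1, -1) is orthogonal to (10, 1, 1).
vertex_a = unit([10.0, 1.0, 1.0])
vertex_b = rotate(vertex_a, unit([0.0, 1.0, -1.0]), DELTA)

for name, a, b in [("center", center_a, center_b),
                   ("vertex", vertex_a, vertex_b)]:
    print(name,
          "angle=%.5f" % angle(a, b),
          "planar=%.5f" % planar_distance(a, b),
          "logratio=%.5f" % logratio_distance(a, b))
```

Both pairs are built to have the same angle, so any disagreement between the "planar" or "logratio" columns for the two pairs comes from the change of space and not from the data. The sketch only shows that the three measures disagree; it does not by itself settle which of them is the appropriate one.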
For example, over the last 25 years or more a large number of very complicated programs have been written to translate a chemical or oxide composition into the appropriate mineral composition and so identify the rock that was chemically analysed. But this problem is simple in spherical space - the correct proportions of the appropriate minerals, and only those, are directly specified by the spherical natural neighbor coordinates of the chemical composition taken with respect to the set of all minerals.

Philip, G.M. and Watson, D.F., 1988, Determining the representative composition of a set of sandstone samples, Geol. Mag., 125(3), 267-272.
Philip, G.M. and Watson, D.F., 1988, Angles measure compositional differences, Geology, 16, 976-979.
Philip, G.M. and Watson, D.F., 1989, Some geometric aspects of the ternary diagram, J. Geol. Education, 37(1), 27-29.
Watson, D.F. and Philip, G.M., 1989, Measures of variability for geological data, Math. Geol., 21(2), 233-254.
Watson, D.F., 1988, Natural neighbor sorting on the n-dimensional sphere, Pattern Recognition, 21(1), 63-67.

Traditional and conventional statistical procedures cannot adequately treat compositional data. Means, and higher moments, were designed and intended to treat replicate data - that is, repeated observations of INDEPENDENT variables - and have been extended to apply to sets of independent events. Applying these procedures to sets of independent events with associated dependent variables provides a radically different result, which is surficial rather than statistical. Again, applying these procedures to sets of dependent variables provides yet another radically different result. See the diagrams in Watson and Philip, 1989, which display the same set of numbers after different independent/dependent assumptions.

John Aitchison has vehemently disputed these conclusions.
But although he readily agrees that the differences between compositions are angles, his refutation depends upon unsubstantiated assertions, derogatory innuendo, and ridicule. This is understandable, because who could provide a reasoned and logical denial of the evidence given by the back-of-an-envelope sketch?

Aitchison, J., 1990, Comments on "Measures of variability for geological data", Math. Geol., 22(2), 223-226.
Aitchison, J., 1991, Delusions of uniqueness and ineluctability, Math. Geol., 23(2), 275-277.
Aitchison, J., 1992, On criteria for measures of compositional difference, Math. Geol., 24(4), 365-379.
--
  • Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
  • Ulrich FAQ. http://www.pitt.edu/~wpilib/stats99.html