Reliability cautions (1997)

In this file -
 - Misused SAS: a Profile (across items) is not a proper ICC.
 - Pearson is simple for EXAMINING reliability.
  • Negative IC?
=======================Rich Ulrich, 07 Feb 1997==========ssc
From: wpilib+@pitt.edu (Richard F Ulrich)
Subject: Re: Negative intraclass correlations?
Message-ID: <5dg2la$m4f@usenet.srv.cis.pitt.edu>

Tom Bohman, Ph.D. (tmb@WILDER.ORG) wrote:
: Greetings,
: I'm trying to estimate the degree of agreement between mothers and
: fathers in 172 families which I would like to use in further analyses.
: Both parents responded to 14 items using a 4-point Likert response
: scale. I'm using an intraclass correlation (ICC) (as opposed to a
: Pearson correlation) to measure the agreement across the 14 items for
: each set of parents.

-- Do I read this right: You are computing something like a
correlation (or like an ICC) across 14 items for each 2 parents?
If so, I do not think that it is fair to call this any kind of
'correlation', but rather, it is a sort of profile coefficient,
which is computed like a correlation - except the role of 'subjects'
is not random, but is replaced by the fixed set of 14 items with
arbitrarily different means.

: I've downloaded the SAS macro developed by Robert
: Hamer that implements the 6 types of ICCs identified by Shrout and
: Fleiss in their Psychological Bulletin article. The SAS macro uses Proc
: Glm to derive the variance components used to compute the particular
: ICCs.
: About 15 of the 172 ICCs are negative which I would like to understand
: better since it doesn't seem possible to have a negative variance. In
: fact, one of the negative ICCs is a value over 1 (-1.42).

-- Since your data are laid out differently (I am pretty sure) than
SAS ever expected to see for ICC data, and you have an ICC of -1.42,
you are *almost* assuredly applying the formulas wrong. If you post
the data that led to a negative corr., you will probably be given a
corrected test.
*--------
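On the question of how an ICC can be negative at all: the ANOVA-based ICCs are built from differences of mean squares, not from a variance directly, so the estimate goes negative whenever the within-target mean square exceeds the between-target one. The sketch below is only an illustration under stated assumptions: it uses the one-way forms, ICC(1,1) and ICC(1,k), from Shrout and Fleiss, the function name `one_way_icc` is hypothetical, and the ratings are made up. It is not the SAS macro and does not reproduce the original poster's data.

```python
def one_way_icc(rows):
    """Return (single, average) one-way ICCs from ANOVA mean squares.

    rows: one list per target, each holding k ratings of that target.
    """
    n = len(rows)      # number of targets
    k = len(rows[0])   # ratings per target
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    # Between-targets and within-targets mean squares.
    bms = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    wms = sum((x - m) ** 2
              for r, m in zip(rows, row_means)
              for x in r) / (n * (k - 1))
    single = (bms - wms) / (bms + (k - 1) * wms)  # ICC(1,1)
    average = (bms - wms) / bms                   # ICC(1,k)
    return single, average


# Hypothetical data: 4 "targets", 2 ratings each, on a 4-point scale.
# The two ratings disagree far more within a row than the rows differ.
ratings = [[1, 4], [4, 1], [2, 3], [3, 3]]
single, average = one_way_icc(ratings)
print(f"ICC(1,1) = {single:.3f}, ICC(1,k) = {average:.3f}")
```

In this form the single-rating coefficient cannot fall below -1/(k-1), which is -1 for two raters, while the average-rating coefficient has no lower bound. So a value like -1.42 is at least a reason to check which formula was applied and how the data were laid out.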
  • Why I like Pearson
=======================Rich Ulrich, 05 Mar 1997==========spss
Subject: Re: interrater reliability
Message-ID: <5fkpov$5r@usenet.srv.cis.pitt.edu>

David A. Rowe (darowe@FRANK.MTSU.EDU) wrote:
: On Mon, 3 Mar 1997, Michael Lacy wrote:
: (from a previous message)
: > >Hi. I have SPSS 5.0.2 for Windows. I have a number of 7-point Likert
: scales.
: > >There are two raters for all the measures. The raters tend to hover around
: > I can't get a decent correlation with paired T's, Pearson or Kendall's.
: (Michael's response)
: > Pearson's r, for example,
: > will tell you the extent to which one set of ratings are a linear function
: > of the other, rather than identical to the other. Kappa is available

DAR:
: - in fact, for many, if not most, reliability situations, the Pearson
: coefficient is not appropriate anyway. As an interclass coefficient, it
: should not be used with two measures of the same variable (rather, it is
: appropriate for estimating correlations between two different variables).

Given ordinal variables, you certainly should *not* use the kappa
which someone else suggested. Kappa ignores order. And kappas for
tables bigger than 2x2 are not readily compared to any other size,
so kappa is mainly for 2x2 comparisons.

The Pearson coefficient is just FINE for looking at reliability
between two scores with *equal means and variations*: then, it will
exactly EQUAL an intraclass (consider, intrAclass vs intERclass)
correlation.

There is a tradition or habit of recommending intraclass
correlations, but the assumption behind doing Intraclass seems (to
me) to be that one is ignoring the differences that might exist
between means or variances. Since I insist on looking at those,
separately, I like to look at similarities using the same Pearson
coefficient that everyone is used to from other contexts.
The advantages of the Pearson over an Intraclass correlation, in
fact, are several: You can look at it immediately as output of many
programs; and there is only one version of it; and what you have is
essentially orthogonal to the difference between raters that you
should detect with the paired t-test.

By contrast, you usually have to do some special computation to get
your intraclass correlation, and there are three or four different
formulas (which, by the way, are easy to mess up), a couple of
varieties being grossly different; and what you get is NOT
independent (statistically) of the difference between scores.

DAR:
: One of the family of intraclass coefficients should be used (I'm talking
: generally, not as a solution to the problem posed in the original posting)

The minor differences in the Family are whether you are assuming
THESE raters, or 'random' raters. The big difference is whether you
get an r that represents a SINGLE score, or the combined, average
score for multiple raters.

*--------FAQ note, March 1998 - for examining your data, I still
recommend using Pearson correlations, as many as needed, between
pairs of raters, pairs of diagnoses, etc.; and paired t-tests (or
McNemar's test) should always be used with them, to check for
differences in "level". For the limited purpose of summarizing for
publication, after you are sure that there are no problems, then the
Intraclass Correlation may be used as a combined indicator of what
you have achieved.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
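The point that Pearson equals an intraclass correlation when means and variances match, and that the two part ways once the raters differ in level, can be checked numerically. This is a minimal sketch, assuming the pairwise ("double-entry") form of the intraclass correlation, which is Pearson's r computed on the data entered in both orders; the ANOVA-based versions can differ slightly from it. The function names and the three sets of ratings are hypothetical, made up for the illustration.

```python
import math
import statistics


def pearson(x, y):
    """Ordinary Pearson product-moment correlation."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def double_entry_icc(x, y):
    """Pairwise intraclass r: Pearson on the data entered in both orders."""
    return pearson(list(x) + list(y), list(y) + list(x))


def paired_t(x, y):
    """Paired t statistic for the mean difference y - x."""
    d = [b - a for a, b in zip(x, y)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


# Hypothetical 7-point ratings. rater_b reorders the same values as
# rater_a, so the two have identical means and variances.
rater_a = [1, 2, 3, 4, 5, 6, 7]
rater_b = [2, 1, 3, 5, 4, 7, 6]
# rater_c is rater_b shifted up by 2: same pattern, different "level".
rater_c = [v + 2 for v in rater_b]

for label, other in (("a vs b", rater_b), ("a vs c", rater_c)):
    print(label,
          f"Pearson = {pearson(rater_a, other):.3f}",
          f"intraclass = {double_entry_icc(rater_a, other):.3f}",
          f"paired t = {paired_t(rater_a, other):.3f}")
```

With these made-up numbers the two coefficients agree for a vs b. For a vs c the Pearson r is unchanged, the intraclass r drops, and the shift shows up in the paired t instead, which is the separation of "similarity" from "level" argued for above.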
  • Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
  • Ulrich FAQ. http://www.pitt.edu/~wpilib/stats99.html