FAQ - Chap. 1, basics *************** Definitions, sources, basics ****************
  • Data entry
  • ========================Ralph Brands, 26 May 1996=======(spss)
    From: brinton@unixg.ubc.ca (Ralph Brands)
    Subject: Re: data entry programs
    Message-ID: <brinton-2605960952090001@port11.annex2.net.ubc.ca>

    Having tried many strategies through the years, we've settled on FoxPro. We bought it only for its data-entry capabilities. You can make input screens, move fields around, and set relations to other screens (when you're entering 750 fields, etc.) very quickly and with minimal reference to manuals. We're NOT programmers and we can use it.

    One of the nicest features is the XBASE default of moving to the next field, without having to hit "Enter" or "Tab", when the current field width is filled. So if you have a yes/no coded as 1/2 and the field width is 1, the program moves to the next field as soon as either value is entered. This simple thing is a programming task, or impossible, in other programs we've used through the years (Oracle Power Objects, tcl/tk, etc.).

    Cost is $99. Programs are compatible across platforms: you can make a Windows program on a Mac and vice versa. The downside: you are subsidizing Microsoft if you buy it.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Data mining? a text. (see Stepwise, for generic warnings.)
  • ... Various Web addresses I have looked for on the subject of "data mining" were missing when I looked.
    ==================Gregory Piatetsky-Shapiro, 1 Mar 1996 ========ssm,cdt
    From: gps0@harvey (Gregory Piatetsky-Shapiro)
    Newsgroups: sci.stat.math,comp.databases.theory
    Subject: New Book: Advances in Knowledge Discovery and Data Mining
    Message-ID: <4ia8bc$2f6@ceylon.gte.com>

    New Book Announcement: Advances in Knowledge Discovery and Data Mining
    -----------------------------------------------
    Edited by Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy
    Published by the AAAI Press / The MIT Press
    ISBN 0-262-56097-6, March 1996, 625 pp. Price: $50.00

    This book can be ordered online from The MIT Press: http://mitpress.mit.edu/
    More info at:
    http://www-mitpress.mit.edu/mitp/recent-books/comp/fayap.html
    http://www.aaai.org/Publications/Press/Catalog/fayyad.html (This AAAI website also has abstracts of chapters)
    ----------------------------------------------------------------------------
    "Advances in Knowledge Discovery and Data Mining" brings together the latest research -- in statistics, databases, machine learning, and artificial intelligence -- that is part of the exciting and rapidly growing field of Knowledge Discovery and Data Mining. Topics covered include fundamental issues, classification and clustering, trend and deviation analysis, dependency modeling, integrated discovery systems, next generation database systems, and application case studies. The contributors include leading researchers and practitioners from academia, government laboratories, and private industry.

    Gregory Piatetsky-Shapiro email: gps@gte.com

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • On-line data sources. Schoenfield. (See other Stat Web pages, generally.)
  • =====================Michael Schoenfield, 25 Jan 1996========(spss)
    Message-ID: <199601260523.XAA24919@execpc.com>
    From: "Michael A. Schoenfield" <maschoen@EARTH.EXECPC.COM>
    Subject: Re: Data Sites on the Net

    A number of persons (scientists and other searchers for truth and data :-) had expressed interest in some of my bookmarks which point to data locations (at least in theory). Because of the number of requests that I've received, I thought it might be easier to send the bookmarks out to the entire list. Please feel free to delete if you're not interested.

    Here is a sampling from my netscape bookmark list:

    Education: http://www.census.gov/org/dusd/edu/about.html European Monthly Monitoring Survey: http://www.cec.lu/en/comm/dg10/infcom/epo/polls.html European Public Opinion: http://www.gallup.com/ Gallup Organization: gopher://icpsr.umich.edu/ Nielsen Media: http://www.nielsenmedia.com/ Nielsen Media Research - Interactive Services: gopher://burrow.cl.msu.edu/11/internet/msu/pda Political Data Archives - Michigan State: http://www.princeton.edu/~abelson/index.html Princeton Survey Research Center: http://www.ciesin.org/datasets/irss/irss.html Public Opinion Item Index - The Institute for Research in Social Science: http://ren.imagis.iupui.edu/pol/ Public Opinion Laboratory: http://www.golan.org.il/polls.html Public Opinion Polls about the Golan Heights: http://www.lib.uconn.edu/RoperCenter/ The Roper Center for Public Opinion Research: gopher://zonnetje.swidoc.nl/11/ Steinmetz Data Archives: http://politicsusa.com/PoliticsUSA/news/1106ip08.html.cgi Times Mirror Study -- PoliticsUSA: http://cansim.epas.utoronto.ca:5680/pwt/pwt.html PWT Database Welcome Page: http://ssda.anu.edu.au/ Social Science Data Archives: http://www.uark.edu/depts/comminfo/www/data.html- American Communication Association: gopher://www.polisci.nwu.edu:70/1 American Politics Gopher at North Western Univ.: http://gate1.dda.dk/dda.html Danish Data Archive: http://dpls.dacc.wisc.edu/ Data and Program Library Service Home Page: http://www.swidoc.nl/star/staralg.html Dutch Social Science Data Archive (Steinmetz): http://www.keele.ac.uk/depts/po/election.htm Elections and Electoral Systems by Country: http://www.soc.qc.edu/ General Social Science Survey, CUNY: gopher://liberty.uc.wlu.edu/11/internet/hytelnet/sites2/ful000/ful021 Hebrew Univ. Social Science Data Archives: http://www.tarki.hu/index-e.html Hungarian Data Archive: http://icpsr.umich.edu/ICPSR_homepage.html ICPSR - Home Page: gopher://statlab.stat.yale.edu/11/Internet_Stats InterNET Resources for Social Science Statistics: http://cc-server9.massey.ac.nz/%7ENZSRDA/ New Zealand Social Science Research Data Archive: http://www.uib.no/nsd/ Norwegian Social Science Data Service: http://www.nau.edu/~srl/ Social Research Lab., N.A.U: http://sosig.esrc.bris.ac.uk/Welcome.html Social Science Information Gateway: http://www.lib.virginia.edu/socsci/ Social Sciences Data Center: http://www.hsrc.ac.za/sada.html The South African Data Archive: http://ssdc.ucsd.edu/ssdc/socsci.html Searchable Catalogs: http://www.stat-usa.gov/stat-usa.textonly.html Internet: General Data Resources: http://www.lib.umich.edu/libhome/Documents.center/stats.html Statistical Resources on the Web: http://WWW.StatCan.CA/ Statistics Canada - Statistique Canada: http://www.ssd.gu.se/enghome.html Univ. of Alberta Social Science Data Archives: http://ssdc.ucsd.edu/ Univ. of Calif. - San Diego Data Collection: gopher://gopher.lib.virginia.edu/11/socsci Univ. of Virginia Social Science Data Archives: gopher://statlab.stat.yale.edu/11/SSDA Yale Social Science Data Archives: http://www.census.gov/stat_abstract/

    This is a brief sampling of my bookmarks and I hope that you will find them useful. Please remember to have fun "truckin" through these sites.

    Mike S.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • on Teaching. Ward. (see other WEB sites).
  • =====================Joe Ward, 14 Mar 1996========sse Message-ID: <Pine.OSF.3.91.960313233554.30617D-100000@gaston.tenet.edu> From: Joe H Ward <joeward@tenet.edu> Subject: Re: Teaching Intro Statistics F:\ASATORON.94\FINALASA.TXT **** HANDOUT FOR "ADOPT-A-SCHOOL" SESSION, ASA TORONTO, 1994 *************** **************************************************************************** EMPOWERING HIGH SCHOOL STUDENTS TO EXPLOIT STATISTICAL MODELS AND SOFTWARE FOR RESEARCH PROJECTS Joe H. Ward, Jr, Health Careers High School, Laura J. Niland, MacArthur High School Joe H. Ward, Jr., 167 E. Arrowhead Dr, San Antonio, TX 78228 Key Words: Adopt-a-School, Linear Models, Computers Introduction Activities of the San Antonio Chapter of ASA involving K-12 students and teachers are presented. These include (1) the Texas Prefreshman Engineering Program (PREP) designed to encourage females and minorities to enter science and engineering careers, (2) Student & Teacher Collaborative Projects in Problem Solving Using Data Analysis, and (3) Statistics Projects at MacArthur High School. These experiences are designed to strengthen the statistics and computer skills (using BUSINESS MYSTAT) of students who are involved in independent research projects for science fairs and statistics project/poster contests. A "top-down" approach is used which emphasizes starting with meaningful research questions and introducing new concepts as the need arises. The conceptual framework involves the Big Four Ideas of (1) Prediction, (2) Uncertainty, (3) Modeling, and (4) Optimization. A General Linear Model approach is used, starting with mutually exclusive categorical models with least-squares solutions that yield "cell means". Then more complex models are developed to investigate interactions among variables. 
The major goal of the activities described below is to empower high school students (and their teachers) to make effective use of the combined power of a prediction model (regression, linear model) approach and computers in data analysis for practical research. Probability, statistics and computer topics are introduced when needed. This approach possesses several important advantages over the traditional sequence in introductory statistics instruction: -- Students will have less to learn, because many of the "standard" statistical analysis procedures developed before the availability of high-speed computers can be accomplished with fewer ideas. -- Students will have more power to solve new problems, since they will be able to specify new models for unique problems. -- Students will be able to solve more problems with less computational burden, since the use of statistical software packages allows for solutions to complex prediction problems. Background ..................... ============= ABOUT 200 LINES ARE CUT OUT HERE ============= ..................... Selected References American Association for the Advancement of Science. Science for All Americans. Washington, D.C.: AAAS, 1989. American Statistical Association. Guidelines for the Teaching of Statistics K-12 Mathematics Curriculum. Alexandria, VA: ASA, 1991. Burrill, G., and J. Burrill. (Eds.). Data analysis and Statistics Across the Curriculum. Reston, VA: National Council of Teachers of Mathematics, 1991. Corwin, R., and S.J. Russell. Used Numbers: Real Data in the Classroom. Palo Alto, CA: Dale Seymour Publications, 1990. Foerster, Paul A. Precalculus with Trigonometry: Functions and Applications. Menlo Park, CA: Addison-Wesley, 1986. Fountain, Robert L. and Joe H. Ward, Jr. Regression Models and Software Packages: Synthesizing Traditional Procedures in a One-semester Statistics Course. Presented at ASA Winter Conference at Louisville, KY, 1992. Hale, Robert L., and Jeffrey W. Steagall. 
Business MYSTAT Statistical Applications (DOS Edition). Cambridge, MA: Course Technology, Inc., 1990. Laughlin, Margaret A., H. Michael Hartoonian, and Norris M. Sanders. From Information to Decision Making: New Challenges for Effective Citizenship. Washington, D.C.: National Council for the Social Studies, 1989. Moore, David S., and George P. McCabe. Introduction to the Practice of Statistics, Second Edition. New York, NY: W.H. Freeman, 1993. (This book and supplementary materials accompany the Telecourse videotape series Against All Odds: Inside Statistics available from The Annenberg Project, 1-800-LEARNER. These 26, 30-minute tapes are excellent and are frequently shown on PBS.) National Council of Teachers of Mathematics. Curriculum and Evaluation Standards for School Mathematics. Reston, VA: NCTM, 1989. Ward, Joe H., Jr., and Paul A. Foerster. Integrating Statistics into the Secondary Curriculum. Proceedings of the Third International Conference on Teaching Statistics. ISI Permanent Office, 428 Prinses Beatrixlaan, PO Box 950, 2270 AZ Voorburg, The Netherlands, 1991. Ward, Joe H., Jr., and Earl Jennings. Introduction to Linear Models. Englewood Cliffs, NJ: Prentice-Hall, 1973. Ward, Joe H., Jr. Problem Solving Through Data Analysis. San Antonio, TX: Texas Prefreshman Engineering Program (TexPREP), 1991. Quantitative Literacy Series: Gnanadesikan, M., R.L. Scheaffer, and J. Swift. The Art and Techniques of Simulation. Palo Alto, CA: Dale Seymour Publications, 1987. Landwehr, J.M., and A.E. Watkins. Exploring Data. Palo Alto, CA: Dale Seymour Publications, 1986. Landwehr, J.M., J. Swift, and A.E. Watkins. Exploring Surveys and Information from Samples. Palo Alto, CA: Dale Seymour Publications, 1987. Newman, C.M., T.E. Obremski, and R.L. Scheaffer. Exploring Probability. Palo Alto, CA: Dale Seymour Publications, 1987. * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Dictionaries of terms?
  • =====================James Ssemakula, 25 Jan 1996========ssc
    Message-ID: <960125154430.25830d16@ucrac1.ucr.edu>
    From: James Ssemakula <JAMES@UCRAC1.UCR.EDU>
    Subject: Re: Encyclopaedia of Statistics Terms

    F.H.C. Marriott 1990. A dictionary of statistical terms. Longman & John Wiley.
    or Freund's Dictionary/outline of basic statistics
    or Tietjen 1986 A topical dictionary of statistics
    or Brian Everitt 1995 The Cambridge dictionary of statistics in the medical sciences.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • ... learn about Bayesian theory?
  • =====================Darren Wilkinson, 07 Sept 1995========ssm,sse
    From: D J Wilkinson <D.J.Wilkinson@durham.ac.uk>
    Subject: Re: Wanted: introduction to Bayesian probability
    Message-ID: <42m827$gvd@mercury.dur.ac.uk>

    Klaus-Peter Schriefers (schriefe@x4u2) wrote:
    : 1.) What books and articles are recommended to get an overview
    : on the state of discussion on Bayesian probability theory?

    The book by Bernardo and Smith, "Bayesian Theory", gives an overview of much of Bayesian statistics, without getting very involved in philosophical issues. It has an excellent bibliography.

    : 2.) What newsgroups, WWW-, HTML- documents are there to the same end?

    The Durham statistics guide to stats resources: http://fourier.dur.ac.uk:8000/stats/other.html has a slight Bayesian bias. Also, E.T. Jaynes's book is on the web: http://www.math.albany.edu:8008/JaynesBook.html This is well worth a look if you're a physicist.

    : 3.) What applications of Bayesian techniques to physics are known?

    See Jaynes's book for a few examples.

    : 4.) What other schools of thought exist in the field of probability
    : theory?

    There are the frequentist and likelihood schools (say no more!). There are then various degrees of Bayesianism, starting with the non-informative Bayesians (my choice of words :-) ), who use reference analysis and non-informative priors, then the max-ent people, then the genuine subjective Bayesians, such as Savage and Lindley, and then the extreme subjectivists, such as Bruno de Finetti and Michael Goldstein.

    : 5.) Since I got contact to subjective probability by a book of
    : B.d.Finetti (Probability Theory) who has a special formalism
    : and interpretation I would like to know about reactions to
    : his approach.

    Most "real" Bayesians consider de Finetti's approach to be the most complete and compelling account of the subjectivist's point of view. However, many practicing Bayesians find his approach too difficult to carry out in practice and, since they are most familiar with a probabilistic approach to statistics, prefer to continue using it. The work on Bayes linear methods, though, tries to operationalize the ideas of de Finetti effectively. See: http://fourier.dur.ac.uk:8000/stats/bd/ for more details.

    -- Darren
    Signature page: http://fourier.dur.ac.uk:8000/djw.html

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • What's a good random-number generator?
  • ==========================Herman Rubin, 26 Sep 1994======ssm,maa,sn
    From: hrubin@b.stat.purdue.edu (Herman Rubin)
    Newsgroups: sci.stat.math,comp.ai.alife,sci.nonlinear
    Subject: Re: Fast random number generator, binomial pdf
    Message-ID: <366fd4$jrt@b.stat.purdue.edu>

    In article <19940926100937.Patrick.Onghena@po.psy.kuleuven.ac.be>, Patrick Onghena <Patrick.Onghena@psy.kuleuven.ac.be> wrote:
    >In Article <361v2v$b2m@carbon.denver.colorado.edu> "jrothman@carbon.denver.colorado.edu (Jay Rothman)" says:
    >> Dr A. Kleczkowski (ak133@cus.cam.ac.uk) wrote:
    >> : I am looking for a fast (very fast), reliable random number generator,
    >> : preferably in C (I am using Borland C++ and standard SUN cc):
    >> :
    >> : <part of original message deleted>
    >> :
    >> 1> Simulations run by students in my simulation class using the BC++ RNG
    >> have exhibited anomalies due to the RNG (as one would expect from RNGs
    >> packaged in compilers) - for whatever reason the RNG in TurboPascal fared
    >> better. See Law & Kelton, Simulation Modeling and Analysis, McGraw-Hill,
    >> 2nd edition, page 454 for a RNG in C based on the FORTRAN code of Marse
    >> And Roberts (1983) that has tested well.
    >I also obtained good results with the RNG of Turbo Pascal. Although their
    >mixed congruential algorithm (multiplier 134,775,813, increment 1, and
    >modulus 2**32) has its weaknesses (like any congruential algorithm), it is
    >good enough for most applications and the implementation is fast.
    >Onghena, P. (1993). A theoretical and empirical comparison of mainframe,
    > microcomputer, and pocket calculator pseudorandom number generators.
    > Behavior Research Methods, Instruments, & Computers, 25, 384-395.

    All known reasonably fast algorithms are HIGHLY suspect. There are cryptographically strong procedures, but they are too expensive. Probably the best for production use, assuming enough memory and that predictable memory accesses are fast, is the word Tausworthe method. In this, one sets x[n] = x[n-j] OP x[n-k], where OP is full-word integer addition, or XOR, and j and k are appropriately chosen. For vector processors, both j and k should be large. An example would be to use j=460 and k=607; other Mersenne primes can be used for the larger one. These have been shown to be congruential generators for huge bases.

    It has been suggested that several of these with different Mersenne primes should be XORed; another possibility is to take physical random numbers, which need not be outstanding, and XOR them with the pseudo-random ones when used. But make sure the period of the stored physical random numbers is not too close to a small multiple of a power of 2.

    -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
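The recurrence x[n] = x[n-j] XOR x[n-k] described in the post above can be sketched in a few lines. This is only an illustration, not a vetted production generator: the function name, the 32-bit word size, and the use of Python's own `random` module to fill the initial k words are assumptions made for the example; only the lags j=460 and k=607 come from the post.

```python
import random

MASK32 = 0xFFFFFFFF  # assumption: 32-bit words


def lagged_xor_generator(seed=12345, j=460, k=607):
    """Yield 32-bit words from x[n] = x[n-j] XOR x[n-k].

    The first k words are filled from Python's built-in generator,
    which is a convenience for this sketch, not part of the method.
    """
    if not 0 < j < k:
        raise ValueError("need 0 < j < k")
    seeder = random.Random(seed)
    # Ring buffer holding the last k words; buf[pos] is always x[n-k].
    buf = [seeder.getrandbits(32) for _ in range(k)]
    pos = 0
    while True:
        # x[n-j] sits (k - j) slots after the oldest word.
        new = (buf[(pos + k - j) % k] ^ buf[pos]) & MASK32
        buf[pos] = new          # overwrite the oldest word with x[n]
        pos = (pos + 1) % k
        yield new


if __name__ == "__main__":
    gen = lagged_xor_generator(seed=1)
    print([next(gen) for _ in range(5)])
```

Each output costs one XOR and two array reads, which is why generators of this shape are fast; replacing the XOR with addition modulo 2**32 gives the other variant the post mentions.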
  • Moments: definitions
  • =====================Rich Ulrich, 25 Aug 1995========ssm
    Subject: Re: Moments
    Message-ID: <41ldbc$4av@usenet.srv.cis.pitt.edu>

    Jacob Galley (gal2@kimbark.uchicago.edu) wrote:
    : This is what I think I know about moments: The first moment of a
    : population is the mean; the second moment is the variance; the
    : third moment is the skewness; and the fourth moment is the kurtosis.

    That is close. The simple series runs: the average of X, X^2 (read "X-squared"), X^3, X^4, X^5, etc., going as high as anybody dreams. The second CENTRAL moment is what is labeled the variance, i.e., the average of (X-mean)^2. Skewness comes from the third central moment, an odd power; beyond that, geometrical interpretation gets fuzzy, with kurtosis at the fourth and unnamed things higher up.

    One place you might look in the indices of books on statistics is the `Method of moments' for estimating parameters.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
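The distinction between raw and central moments in the reply above is easy to check numerically. The sketch below uses an invented sample and hypothetical function names; the skewness and kurtosis shown are the usual standardized versions (third and fourth central moments divided by powers of the variance), which is the common textbook convention rather than something stated in the post.

```python
def raw_moment(data, r):
    """Average of X**r: the r-th moment about zero."""
    return sum(x ** r for x in data) / len(data)


def central_moment(data, r):
    """Average of (X - mean)**r: the r-th moment about the mean."""
    mean = raw_moment(data, 1)
    return sum((x - mean) ** r for x in data) / len(data)


def skewness(data):
    """Third central moment scaled by variance**1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Fourth central moment scaled by variance**2."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up data
    print("mean      ", raw_moment(sample, 1))
    print("E[X^2]    ", raw_moment(sample, 2))
    print("variance  ", central_moment(sample, 2))
    print("skewness  ", skewness(sample))
    print("kurtosis  ", kurtosis(sample))
```

Note that the second raw moment and the variance differ by the squared mean, which is exactly the gap between "second moment" and "second central moment" that the reply points out.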
  • Equations for the ? distribution? - Numerical Recipes, online. Normal.
  • ===================Lars Gregersen, 23 Oct 1995========ssm
    From: projlg@ktbar96.kt.dtu.dk (Lars Gregersen (sbj))
    Subject: Re: Looking for src code in Numerical Recipes
    Message-ID: <46fnhn$18g@news.uni-c.dk>

    Numerical Recipes is on the net, try: http://cfata2.harvard.edu/nr/nrhome.html The book can be downloaded in Postscript or Acrobat format. From this it should be possible to extract the source code.

    =====================Chuck Haas, 26 Mar 1996========sse
    Message-ID: <v02120d07ad7da924a7d1@[129.25.24.210]>
    From: haascn@dunx1.ocs.drexel.edu (chuck haas)
    Subject: Re: cdf of multivariate normal

    >I am searching for a good and fast algorithm to approximate the cdf of a
    >multivariate normal. Does there exist an algorithm that handles a 2-, 3-, 4-
    >and more-variate normal distribution? Or are there different algorithms for
    >each dimension? I am interested in references, but surely I like programs,
    >too.

    Some more recent references:

    JV Terza and U Welland, "A Comparison of Bivariate Normal Algorithms", J. Statist. Comput. Simul. 39:115-27 (1991)
    DG Divgi, "Calculation of Univariate and Bivariate Normal Probability Functions", Annals of Statistics 7:4:903-10 (1979)
    W Albers and WCM Kallenberg, "A Simple Approximation to the Bivariate Normal Distribution with Large Correlation Coefficient", Journal of Multivariate Analysis 49:87-96 (1994)
    DR Cox and N Wermuth, "A Simple Approximation for Bivariate and Trivariate Normal Integrals", Int. Stat. Rev. 59:2:263-9 (1991)
    Z Drezner and GO Wesolowsky, "On the Computation of the Bivariate Normal Integral", J. Statist. Comput. Simul. 35:101-7 (1990)

    I have tried to implement some of these algorithms (in MATLAB) and they are fairly tricky. Good luck.

    --- Charles N. Haas

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Using harmonic/geometric means?
  • ====================Aaron Brown, 15 Aug 1996=======ssm
    From: aacbrown@aol.com (AaCBrown)
    Message-ID: <4uvaf7$mk8@newsbf02.news.aol.com>

    >> Under what conditions should the harmonic and geometric means be used?

    The simplest answer for the geometric mean is: when multiplying data makes more sense than adding it. For example, if I make $50 today and lose $40 tomorrow, I made an average of $5 per day. It makes sense to add dollars. But if my mutual fund makes +50% this year and -40% next year, it does not make sense to add these numbers. I did not get a +10% return over the two years but a -10% return. In this case the geometric mean of the wealth ratios (1.5 and 0.6) is a more useful measure than the arithmetic mean.

    Similarly, a harmonic mean makes sense when the inverse of the data is the relevant variable. If I drive 100 kph for half a trip and 25 kph for the other half, my average speed is 62.5 kph (arithmetic mean) if "half" means half the time, but 40 kph (harmonic mean) if "half" means half the distance.

    If there is a big difference between the different means, then you likely have one or more "outliers" that are very different from the other values. In this case it often makes more sense to report a mean (of either sort) for the bulk of the data and note the existence of the outliers. There may be no single number that adequately represents the data.

    Aaron C. Brown New York, NY

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
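The two worked examples in the post above (the mutual fund and the trip) can be reproduced with a short sketch. The function names are chosen for this illustration; Python's standard `statistics` module offers `geometric_mean` and `harmonic_mean` too, but writing them out shows what each one does.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed through logs; values must be > 0."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Reciprocal of the average reciprocal; values must be nonzero."""
    return len(values) / sum(1.0 / v for v in values)


if __name__ == "__main__":
    # Dollars add: +$50 and -$40 average to $5 per day.
    print(arithmetic_mean([50, -40]))

    # Returns multiply: wealth ratios 1.5 and 0.6 compound to 0.9 overall.
    g = geometric_mean([1.5, 0.6])
    print(g, g * g)        # per-year ratio, and the two-year ratio it implies

    # Equal distances at 100 kph and 25 kph.
    print(harmonic_mean([100, 25]))
    # Equal times at 100 kph and 25 kph.
    print(arithmetic_mean([100, 25]))
```

The geometric mean of the wealth ratios is the square root of 0.9, roughly 0.949, so the fund lost about 5.1% per year on average, which compounds to the -10% two-year return the post describes.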
  • ML Maximum likelihood vs other statistics
  • =====================Rich Ulrich, 25 Oct 1995========ssc
    From: wpilib+@pitt.edu (Richard F Ulrich)
    Subject: Re: Was: How to normalize data?
    Message-ID: <46m0jk$k9i@usenet.srv.cis.pitt.edu>

    Bill Simpson (wsimpson@uwinnipeg.ca) wrote:
    : Why would anyone want to do a transformation these days?
    : E.g. fitting a simple y=f(x)+error model. Why not just find a reasonable
    : model form for f(x) and a reasonable distribution for the error, and then
    : fit the model by maximum likelihood?
    : It seems to me that transformations were useful in the days before
    : easy maximum likelihood computation. Those days are over.

    << original question, revised answer ... >>

    Are you asking, "Why do people still use Ordinary Least Squares (OLS) analyses when maximum likelihood computations could avoid, say, the need for transformation beforehand?"

    Here is how I proceed with problematic data: I look at the same time for a reasonable transformation, which is one that makes sense, considering where the data come from, especially in the regard of leaving a reasonable distribution for the error. A reason for doing this is that it should allow for a direct, OLS solution, which gives all those useful, subsidiary statistics - means for groups; correlations among variables. These things only make decent sense after my choice of metric (transformations) has gotten rid of really extreme outliers.

    When I know what my metric is, I can look at a scatter-plot (say) and see that there are still some outliers - so maybe my model is INVALID if I don't get rid of them. If I just plug my numbers into the statistical model and pray, then who needs a statistician?

    So, what do you have in mind? I do not look forward to "easy maximum likelihood computation" that replaces the examination of data. Where MLE provides a better STATEMENT of the problem, then MLE solutions should be pursued. But a simple transformation, with OLS solution, seems to me to be preferable to a direct MLE solution that merely buries the transformation in unintelligible statistics (and computerized computations that STILL may take 10 or 100 times as long).

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  • Why Pearson's test, for contingency table?
  • ======================Rich Ulrich, 04 Jun 1996=======sse
    From: wpilib+@pitt.edu (Richard F Ulrich)
    Subject: Re: Goodness-of-fit statistics
    Message-ID: <4p1hra$fqj@usenet.srv.cis.pitt.edu>

    Rich Strauss (y8res@ttacs1.ttu.edu) wrote:
    Strauss asked about Pearson's chi-square statistic for goodness of fit,
    : (1) Is there any rationale, other than historical convenience, for using the
    : particular weighting scheme of Pearson's statistic?

    Let me recommend to you an article from a few months ago, "A single general method for the analysis of cross classified data: Reconciling [...]", Leo Goodman, JASA 96:[my ref. here is mangled. Next it says 443:408-428]. This discusses not only Pearson's test but also the maximum likelihood log-linear test, which is fairly common, as well as Yule's test and the whole family. The family was also mentioned in Agresti's _Categorical data analysis_, which cites Cressie and Read for introducing it as "power divergence" statistics.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
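To make the "family" mentioned above concrete, here is a minimal sketch of the power divergence statistic, 2/(lam*(lam+1)) * sum(O * ((O/E)**lam - 1)), in which lam = 1 reproduces Pearson's chi-square and the limit lam -> 0 gives the likelihood-ratio (G-squared) statistic. The function name and the example counts are invented for illustration, and the sketch assumes observed and expected counts are all positive and have the same total.

```python
import math


def power_divergence(observed, expected, lam):
    """Power divergence statistic for positive counts with equal totals.

    lam = 1 gives Pearson's chi-square; lam = 0 is treated as the
    limiting case, the likelihood-ratio statistic G^2.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if lam == 0:
        return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
    if lam == -1:
        raise ValueError("lam = -1 is a separate limiting case, not handled here")
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 / (lam * (lam + 1.0)) * total


def pearson_chi_square(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    observed = [18, 22, 30, 30]     # made-up counts
    expected = [25, 25, 25, 25]
    print(pearson_chi_square(observed, expected))
    print(power_divergence(observed, expected, 1))   # same value as Pearson
    print(power_divergence(observed, expected, 0))   # G^2
```

Seen this way, Pearson's weighting scheme is one point on a continuum rather than a unique choice, which is the thrust of the question quoted in the post.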
  • ... know where to round-off numbers?
  • ==========================Bob Wheeler, 31 May 1996=======ssc
    Subject: Re: rounding error
    Message-ID: <31AF6062.531A@echip.com>

    > I am working some county population estimates based on state totals and
    > have found that I have rounding error. For example, in a county I have
    > 326.588 people, and I can't very well have .588 people. If I simply
    > delete the .588 people and the other fractions, my state population will
    > be off. Does anyone know of any easy methods I could use to adjust the
    > numbers without changing the state total.
    --
    Optimal (or efficient) rounding has been studied, and is used on problems like yours. I don't have a direct reference, but look in CIS for papers by Friedrich Pukelsheim. He is concerned with optimal experimental design, and in pursuit of that, one of his papers cites the literature that will interest you. It is possible that you can find it by searching CIS for "efficient rounding," but this may be Pukelsheim's coinage.

    Bob Wheeler, ECHIP, Inc.

    From: knight@unb.ca
    Date: Thu, 30 May 1996 15:25:33 GMT
    Message-ID: <knight.26.31ADBDED@unb.ca>

    Unkind question: What is the standard error? Rounding to 326 or 327 suggests accuracy in the last digit. If the standard errors be more than 10, is this honest, or should one round to 320 or 330? I.e., should the problem perhaps be, rather than how to round the .588, how to round the *6*.588?

    bill knight / university of new brunswick / canada

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
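One simple way to round county figures so that they still add up to the state total is the largest-remainder method: round everything down, then hand the leftover units to the entries with the biggest fractional parts. This is a standard apportionment technique offered here as an illustration; it is not necessarily the "efficient rounding" of the literature cited above, and the function name and the county figures other than 326.588 are invented.

```python
import math


def round_preserving_total(values, total=None):
    """Round each value to an integer so the results sum to `total`.

    If `total` is omitted, the rounded sum of `values` is used.
    Largest-remainder rule: floor everything, then add 1 to the
    entries with the largest fractional parts until the total is met.
    """
    if total is None:
        total = round(sum(values))
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)
    if not 0 <= shortfall <= len(values):
        raise ValueError("total is not reachable by rounding each value up or down")
    # Indices ordered by fractional part, largest first (ties keep input order).
    order = sorted(range(len(values)),
                   key=lambda i: values[i] - floors[i],
                   reverse=True)
    result = list(floors)
    for i in order[:shortfall]:
        result[i] += 1
    return result


if __name__ == "__main__":
    counties = [326.588, 120.301, 53.111]   # hypothetical county estimates
    rounded = round_preserving_total(counties)
    print(rounded, sum(rounded))
```

This keeps the state total exact, but it says nothing about the second reply's point: if the standard error of an estimate is larger than 10, reporting it to the last digit overstates its precision however the fractions are distributed.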
  • Document by Rich Ulrich. E-mail to wpilib+@pitt.edu
  • FAQ top.
  • Ulrich home page.
  • Ulrich FAQ. http://www.pitt.edu/~wpilib/stats99.html