C C*********************************************************************** C C CLUSTER DATA GENERATOR PROGRAM C C PROGRAMMED BY GLENN W. MILLIGAN IN FORTRAN IV. C THIS PROGRAM HAS EXECUTED SUCCESSFULLY ON A NUMBER OF IBM C AND AMDAHL MAIN-FRAME MACHINES. IT HAS ALSO EXECUTED ON A C PRIME COMPUTER. C REVISED FALL 1984. C C This is the PC Version of CLUSTER DATA GENERATION PROGRAM C modified by F.J.Carmone,Jr. 3/89 C C DATA OUTPUT OF THE GENERATOR IS PLACED IN FILES 1 THROUGH 11 (ONE FILE C FOR EACH OF THE 11 TYPES OF MATRICES). C PRINTED OUTPUT OF THE GENERATOR IS PLACED IN FILE 12. C C THE FIRST LINE OF EACH SET CONSISTS OF PRELIMINARY C INFORMATION AND INCLUDES: C C SET (SEQUENTIAL NUMBER) C TYPE (ERROR CONDITION TYPE -DESCRIBED BELOW) C NPOINT (NUMBER OF POINTS IN THE DATA SET) C DIMNR (INTERNAL DO-LOOP INDICATOR FOR DIMENSIONS) C TDIMNR (NUMBER OF DIMENSIONS FOR THE DATA SET) C CLUSNR (INTERNAL DO-LOOP INDICATOR FOR # OF CLUSTERS) C TCLUSN (NUMBER OF CLUSTERS IN DATA SET) C DENSIT (DENSITY LEVEL - DISTRIBUTION PATTERNS OF POINTS C TO THE CLUSTERS) C SAVENB(10) (INDICATES CLUSTER MEMBERSHIP OF THE POINTS C TO THE CLUSTERS. POINTS ARE NUMBERED SEQUENTIALLY C WITHIN CLUSTER) C C THE FORMAT IS (18I3). C C THIS VERSION OF THE GENERATOR WRITES OUT THE COORDINATE VALUES. C THE FORMAT IS (11F8.3). (ON IBM EQUIPMENT, ONE MAY WANT TO CHANGE THIS C TO (10F8.3,/,F8.3). C C C THE GENERATOR IS DESCRIBED AND REFERENCED IN: C C "AN EXAMINATION OF THE EFFECT OF SIX TYPES OF ERROR PERTURBATION C ON FIFTEEN CLUSTERING ALGORITHMS", PSYCHOMETRIKA, 1980, 325-342. C C "A MONTE CARLO STUDY OF THIRTY INTERNAL CRITERION MEASURES FOR C CLUSTER ANALYSIS", PSYCHOMETRIKA, 1981, 187-199. C C "THE EFFECT OF CLUSTER SIZE, DIMENSIONALITY, AND THE NUMBER OF C CLUSTERS ON RECOVERY OF TRUE CLUSTER STRUCTURE", IEEE TRANSACTIONS C ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1983, 40-47. C C "AN EXAMINATION OF PROCEDURES FOR DETERMINING THE NUMBER OF CLUSTERS C IN A DATA SET", PSYCHOMETRIKA, 1985, IN PRESS. C C "AN ALGORITHM FOR GENERATING ARTIFICIAL TEST CLUSTERS", UNDER REVIEW. C C C DESCRIPTION OF THE ALGORITHM: C C EACH PASS THROUGH THE PROGRAM GENERATES UP TO 11 DATA SETS WHICH CORRESPOND C TO THE 11 ERROR CONDITIONS PRESENTED IN THE 1980 PSYCHOMETRIKA PAPER. C C THESE ERROR CONDITIONS ARE: C C TYPE 1: ERROR FREE DATA - STRONG AND DISTINCT CLUSTERING PRESENT. C CLUSTERS ARE KNOWN TO BE NON-OVERLAPPING BECAUSE PROGRAM C CREATES CLUSTERS ON THE FIRST DIMENSION WHICH DO NOT C OVERLAP. OVERLAP IS PERMITTED ON THE REMAINING DIMENSIONS. C OVERLAP OF CLUSTER BOUNDARIES IS LIKELY HERE, BUT NOT C REQUIRED. C C TYPE 2: 20% ADDED OUTLIER POINTS. C *** BY DEFAULT, THE OUTLIERS FOR TYPE 2 AND TYPE 3 C ARE PLACED TOGETHER IN ONE DATA SET AND ARE PRINTED C WITH THE TYPE 2 SET. POINTS 51-60 ARE THE 20% SET C AND POINTS 61-80 ARE THE 40% SET. THE BASIC SET C (POINTS 1-50) ARE THE SAME AS TYPE 1. THIS CAN C BE SUPPRESSED BY SETTING ICHK=0 IN THE PROGRAM, C ONE WILL THEN GET TYPE 2 AND TYPE 3 SETS COMPLETE C AND SEPARATE. C C TYPE 3: 40% ADDED OUTLIER POINTS. C C TYPE 4: ERROR PERTURBED COORDINATES (LOW LEVEL). C C TYPE 5: ERROR PERTURBED COORDINATES (HIGH LEVEL). C C TYPE 6: ADDED RANDOM NOISE DIMENSION. C *** BY DEFAULT, THE RANDOM NOISE DIMENSION HAS BEEN C APPENDED TO THOSE IN TYPE 7. THE TRUE DIMENSIONS ARE C THE SAME FOR BOTH TYPE 6 AND 7. TO HAVE TYPE 6 C PRINTED, SET ICHK=0 IN THE PROGRAM BELOW. C C TYPE 7: 2 ADDED RANDOM NOISE DIMENSIONS. C A THIRD DIMENSION HAS BEEN APPENDED TO THIS SET WHICH C IS THE RANDOM NOISE DIMENSION FROM TYPE 6. THIS IS C SUPPRESSED IF TYPE 6 IS PRINTED. C C TYPE 8: TYPE 1 DATA - ORIGINALLY USED BY THE CLUSTERING C ALGORITHMS TO COMPUTE A PEARSON CORRELATION BASED C DISTANCE MEASURE. C *** BY DEFAULT, PRINTING OF THIS REDUNDANT DATA SET C IS SUPPRESSED. TO PRINT THE SET, SET ICHK=0 IN C PROGRAM. C C TYPE 9: STANDARDIZED COORDINATES (MEAN=0, VARIANCE=1). C C TYPE 10: TYPE 9 DATA - ORIGINALLY USED BY THE CLUSTERING C ALGORITHMS TO COMPUTE CATTEL'S R-sub-p DISTANCE C MEASURE. C *** BY DEFAULT, PRINTING OF THIS REDUNDANT DATA SET C IS SUPPRESSED. TO PRINT THE SET, SET ICHK=0 IN C PROGRAM. C C TYPE 11: RANDOM NOISE DATA (NO CLUSTER STRUCTURE). C C C 108 SETS ARE CREATED CONSISTING OF 7 (OR UP TO 11) MATRICES PER SET. C C C INFORMATION CONCERNING THE RANDOM NUMBER GENERATORS IS CONTAINED C IN THE COMMENTS IN THESE SUBROUTINES. C C FOR TESTING PURPOSES, OUTPUT FROM THE ROUTINES IS PRESENTED HERE. C IF THE SEED 91419377 IS USED TO START THE RANDOM (UNIFORM) C SUBROUTINE, THEN THE FIRST 50 RANDOM NUMBERS ARE: C C 0.79383 0.08065 0.44663 0.34475 0.17408 C 0.87383 0.45221 0.79664 0.99726 0.49148 C 0.36483 0.30704 0.45858 0.81845 0.83512 C 0.06002 0.20356 0.22399 0.21721 0.41222 C 0.12568 0.79204 0.12266 0.24396 0.24851 C 0.78879 0.88596 0.92550 0.55723 0.86543 C 0.66612 0.56830 0.73892 0.71290 0.75272 C 0.19999 0.38401 0.60427 0.98854 0.97052 C 0.29556 0.47895 0.06460 0.46550 0.14661 C 0.94183 0.78732 0.67191 0.81946 0.76911 C C FOR THE NORMAL GENERATOR, IF THE SEED 91419377 IS USED, THE FOLLOWING C 50 NUMBERS ARE OBTAINED: C C 0.59414 0.32981 -0.71200 1.05123 1.31245 C -1.33190 0.36395 -1.20613 -0.07394 0.00396 C -0.49812 1.32986 0.52066 -1.13496 0.55811 C 0.22105 0.29031 1.76049 -1.48837 0.91572 C 0.53171 -1.96603 0.07774 2.04712 0.40269 C -1.61939 0.43917 -0.22203 0.71736 -0.80929 C -0.81969 -0.37507 -0.17970 -0.75686 0.23294 C 0.71684 -1.09707 -0.84298 0.14926 -0.02797 C -1.54770 0.20586 -2.28599 0.50338 1.83014 C -0.70040 -0.32587 -0.60995 0.07558 -0.62650 C C C FOR PURPOSES OF PROGRAM MODIFICATION, THE FOLLOWING INFORMATION IS C PROVIDED. C C NUMBER OF REPLICATIONS: THIS WILL BE EASY TO ADJUST BY CHANGING C THE UPPER LIMIT OF THE FIRST DO-LOOP (DO 1 REPS=1,3). C C NUMBER OF DATA POINTS: THIS WILL REQUIRE SOME WORK. FIRST, MANY C ARRAY SIZES WILL HAVE TO BE CHANGED. THE DENSTY AND OUTLIR ARRAYS C WILL NEED ADJUSTMENT, AS WILL THE POINTERS TO THESE ARRAYS IN C THE PROGRAM. FINALLY, SEVERAL ADJUSTMENTS IN THE PROGRAM CODE C WILL BE REQUIRED. C ONE SHOULD BE ABLE TO DO THIS. ADJUSTMENT OF THIS FACTOR C MIGHT GIVEN INTERESTING RESULTS. C C NUMBER OF DIMENSIONS: LESS DIFFICULT THAN THE NUMBER OF POINTS, C BUT PROBABLY LESS INTERESTING AS WELL. INCREASING THE NUMBER C OF DIMENSIONS SHOULD ONLY MAKE IT EASIER FOR THE ALGORITHMS C TO FIND THE TRUE STRUCTURE. THIS IS A RESULTS OF THE PARTICULAR C CLUSTER CONSTRUCTION ALGORITHM USED. C C DISTRIBUTION PATTERN OF THE POINTS TO THE CLUSTERS (DENSITY FACTOR): C THIS FACTOR SHOULD NOT BE TOO DIFFICULT TO ADJUST BY ITSELF. C ONE MUST CHANGE THE VALUES IN THE DENSTY ARRAY. OTHER MINOR C CHANGES MIGHT BE REQUIRED. C C NUMBER OF CLUSTERS: CURRENTLY THIS IS 2 TO 5. THIS WOULD BE DIFFICULT C TO CHANGE. FOR MORE CLUSTERS, ONE ALSO WOULD PROBABLY WANT A C LARGER NUMBER OF POINTS. CERTAINLY, C ARRAY SIZES WOULD CHANGE AS WOULD THE VALUES IN DENSTY AND OUTLIR C ARRAYS. OTHER CHANGES IN THE PROGRAM CODE WOULD BE REQUIRED. C C TRUNCATED MIXTURES COULD BE CHANGED TO UNCONSTRAINED MIXTURES FAIRLY C EASILY. C C PROBABILITY DISTRIBUTION FOR THE POINTS OR ERROR VALUES: THIS COULD C BE CHANGED FAIRLY EASILY BY ADDING OTHER RANDOM VARIATE GENERATION C SUBROUTINES AND MODIFYING THE APPROPRIATE CALLS IN THE MAIN C PROGRAM. ALTERNATIVE DISTRIBUTIONS MIGHT PRODUCE INTERESTING C RESULTS. C C OTHER MINOR CHANGES SUCH AS ADJUSTING PROGRAM CONSTANTS SHOULD BE C EASY TO EFFECT. C C*********************************************************************** C C C MAIN PROGRAM C DIMENSION COORD(320,11), 1 BOUNDS(2,5,8), 2 STDDEV(5,8), 3 ERROR(200,8), 4 LIST(200), 5 NOUT(5), 6 N(5) C INTEGER SET, 1 CLUSNR, 2 TCLUSN, 3 DIMNR, 4 TDIMNR, 5 DENSIT, 6 REPS, 7 POINT, 8 DENSTY(42), 9 OUTLIR(42), A TEMP2, B CLUST(5,8), C TDIMN1, D TCLUS1, E SAVENB(10), F TYPE, G TP1, H TP2, I OLDNUM, KPRNT(11), TP3 C REAL MEAN(5,8), 1 LENGTH(5,8), 2 LOWER, 3 NORML C REAL*8 SEED CHARACTER*80 TITLE C COMMON / RAND/UNI(9800) COMMON / RAND2/NORML(9800) C C THE DENSTY VECTOR INDICATES THE NUMBER OF POINTS PER CLUSTER C DATA DENSTY /25,25,16,17,17,12,12,13,13,10,10,10,10,10,5,45,5,22, 1 23,5,15,15,15,5,11,11,11,12,30,20,30,10,10,30,6,7,7,30,5,5,5,5/ C C THE OUTLIR VECTOR INDICATES THE NUMBER OF OUTLYING POINTS C ASSIGNED TO EACH CLUSTER IN THE DENSTY VECTOR C DATA OUTLIR /15,15,9,9,12,6,6,9,9,6,6,6,6,6,3,27,3,12,15,3,9,9,9, 1 3,6,6,6,9,18,12,18,6,6,18,3,3,6,18,3,3,3,3/ C OPEN(10,FILE='OUT.S01',STATUS='UNKNOWN') OPEN(20,FILE='OUT.S02',STATUS='UNKNOWN') OPEN(30,FILE='OUT.S03',STATUS='UNKNOWN') OPEN(40,FILE='OUT.S04',STATUS='UNKNOWN') OPEN(50,FILE='OUT.S05',STATUS='UNKNOWN') OPEN(60,FILE='OUT.S06',STATUS='UNKNOWN') OPEN(70,FILE='OUT.S07',STATUS='UNKNOWN') OPEN(80,FILE='OUT.S08',STATUS='UNKNOWN') OPEN(90,FILE='OUT.S09',STATUS='UNKNOWN') OPEN(100,FILE='OUT.S10',STATUS='UNKNOWN') OPEN(110,FILE='OUT.S11',STATUS='UNKNOWN') OPEN(120,FILE='OUT.SUT',STATUS='UNKNOWN') WRITE(*,9090) WRITE(120,9090) 9090 FORMAT(9(/),1X,78('*')/5(1X,'*',T79,'*'/), 1 1X,'*',T27,'C - L - U - S - T - G - E - N', T79,'*'/ 2 1X,'*', T79,'*'/ 3 1X,'*', T79,'*'/ 4 1X,'*',T27,' by: ',T79,'*'/ 5 1X,'*',T27,' GLENN MILLIGAN ',T79,'*'/ 6 1X,'*',T27,' OHIO STATE UNIVERSITY ',T79,'*'/ 7 1X,'*', T79,'*'/ 8 1X,'*',T27,' PC Version by: ',T79,'*'/ 9 1X,'*',T27,' F.J.CARMONE,JR ',T79,'*'/ A 1X,'*',T27,' DREXEL UNIVERSITY ',T79,'*'/ B 5(1X,'*',T79,'*'/), C 1X,78('*')/) WRITE(*,9095) 9095 FORMAT(' TYPE IN THE TITLE FOR THIS RUN ') READ(*,'(A)') TITLE WRITE(*,219) TITLE WRITE(120,219) TITLE 219 FORMAT(//' TITLE = ',A) C C READ IN PARAMETERS C C KREPS = NO. OF REPLICATIONS C KCLUS = MAX. NO. OF CLUSTERS C KDIMN = MAX. NO. OF DIMENSIONS C KSEED = SEED FOR RANDNUM NO. GENERATOR C KPRNT(I) = IF GT 0, PRINT THE iTH ERROR TYPE CLUSTERS C KPNTS = NO. OF POINTS (50,100,150,200) C WRITE(*,9096) 9096 FORMAT(//' TYPE IN THE FOLLOWING PARAMETERS'/ * ' NO. OF REPLICATIONS --> '\I5) READ(*,*) KREPS WRITE(*,9097) 9097 FORMAT( ' MAX. NO. OF CLUSTERS (2,3,4,5) --> '\I5) READ(*,*) KCLUS WRITE(*,9098) 9098 FORMAT( ' MAX. DIMENSIONS (1=4 dim, 2=6, 3=8) --> '\I5) READ(*,*) KDIMN WRITE(*,9099) 9099 FORMAT( ' SEED FOR RANDOM GENERATOR (0=91419377)--> '\I8) READ(*,*) KSEED WRITE(*,9199) 9199 FORMAT( ' NO. OF POINTS (50,100,150,200) --> '\I5) READ(*,*) KPNTS WRITE(*,9100) 9100 FORMAT( ' OUTPUT CONTROL FOR 11 ERROR TYPES - USE:'/ * ' 0 = SUPPRESS OUTPUT, 1 = PRINT CLUSTERS TO FILE --> '/) DO 9101 KIPR=1,11 WRITE(*,9102) KIPR 9102 FORMAT(' ERROR TYPE ,'\I2,': ') READ(*,*) KPRNT(KIPR) 9101 CONTINUE SEED = KSEED IF(KSEED .EQ. 0) SEED=91419377 WRITE(120,202) KREPS, KCLUS, KDIMN, SEED, KPNTS, KPRNT WRITE(*,202) KREPS, KCLUS, KDIMN, SEED, KPNTS, KPRNT 202 FORMAT(' PARAMETERS FOR THIS RUN '/ * ' NO. OF REPLICATIONS.........',I5/ * ' MAX. NO. OF CLUSTERS........',I5/ * ' MAX. DIMENSION SET..........',I5/ * ' SEED FOR RANDNO GENERATOR...',F12.1/ * ' NUMBER OF POINTS............',I5/ * ' PRINT CONTROL...............',11I2/) C C KMULT = KPNTS / 50 C SET=0 C C MASTER CONTROL DO-LOOPS: C REPLICATIONS C NUMBER OF CLUSTERS C NUMBER OF DIMENSIONS C DENSITY LEVELS (DISTRIBUTION OF POINTS TO CLUSTERS) C KCLUS = KCLUS - 1 C DO 1 REPS=1,KREPS DO 2 CLUSNR=1,KCLUS TCLUSN=CLUSNR+1 TCLUS1=TCLUSN-1 ITT=2*TCLUSN DO 3 DIMNR=1,KDIMN WRITE(*,9200) REPS, CLUSNR, DIMNR 9200 FORMAT(' REPS, CLUSNR, DIMR', 3I6) TDIMNR=DIMNR*2+2 TDIMN1=TDIMNR-1 TP1=TDIMNR+1 TP2=TP1+1 DO 4 DENSIT=1,3 SET=SET+1 C C GENERATE CLUSTER BOUNDARIES C NBOUND=TDIMNR*TCLUSN*2 CALL RANDOM(SEED,NBOUND) ICNT=0 DO 5 IDIM=1,TDIMNR C C COMPUTE CLUSTER LENGTHS C DO 6 ICLUS=1,TCLUSN ICNT=ICNT+1 LENGTH(ICLUS,IDIM) = 10.0 + 30.0 * UNI(ICNT) CLUST(ICLUS,IDIM) = ICLUS 6 CONTINUE 5 CONTINUE C C COMPUTE ORDER OF CLUSTERS, FIRST SORT THE RANDOM NUMBERS C DO 7 J=1,TDIMNR DO 8 I=1,TCLUSN ICNT=ICNT+1 UNI(I)=UNI(ICNT) 8 CONTINUE DO 9 K=1,TCLUS1 K1=K+1 DO 10 L=K1,TCLUSN IF(UNI(L).GT.UNI(K)) GO TO 10 TEMP1=UNI(L) UNI(L)=UNI(K) UNI(K)=TEMP1 TEMP2=CLUST(L,J) CLUST(L,J)=CLUST(K,J) CLUST(K,J)=TEMP2 10 CONTINUE 9 CONTINUE 7 CONTINUE NBOUND=(TCLUSN-1)*TDIMNR+1 C CALL RANDOM(SEED,NBOUND) C ICNT=1 C C COMPUTE CLUSTER BOUNDARIES FOR DIMENSION 1 - OVERLAP IS NOT C PERMITTED. ALSO COMPUTE CENTROID AND STD. DEV. C LOWER=0. DO 12 I=1,TCLUSN UPPER=LOWER+LENGTH(I,1) I1=CLUST(I,1) BOUNDS(1,I1,1)=LOWER BOUNDS(2,I1,1)=UPPER MEAN(I1,1)=(UPPER-LOWER)/2. + LOWER STDDEV(I1,1)=LENGTH(I,1)/3. IF(I.EQ.TCLUSN) GO TO 13 ADD=(STDDEV(I1,1)+LENGTH(I+1,1)/3.)*(.25+.5*UNI(ICNT)) LOWER=UPPER+ADD ICNT=ICNT+1 12 CONTINUE 13 SAVEBD=UPPER*.67 SAVE=UPPER C C COMPUTE CLUSTER BOUNDARIES FOR THE REMAINING DIMENSIONS - OVERLAP IS C PERMITTED. COMPUTE CENTROID AND STD. DEV. C DO 11 J=2,TDIMNR DO 14 I=1,TCLUSN LOWER=UNI(ICNT)*SAVEBD ICNT=ICNT+1 I1=CLUST(I,J) UPPER=LOWER+LENGTH(I,J) BOUNDS(1,I1,J)=LOWER BOUNDS(2,I1,J)=UPPER MEAN(I1,J)=(UPPER-LOWER)/2. + LOWER STDDEV(I1,J)=LENGTH(I,J)/3. 14 CONTINUE 11 CONTINUE C C SPECIFY DENSITY WITHIN CLUSTER AND OUTLIERS C POINT=1 IF(DENSIT.EQ.2) POINT=15 IF(DENSIT.EQ.3) POINT=29 IF(TCLUSN.EQ.3) POINT=POINT+2 IF(TCLUSN.EQ.4) POINT=POINT+5 IF(TCLUSN.EQ.5) POINT=POINT+9 DO 15 I=1,TCLUSN C C NOTE CHANGE TO ACCOMODATE THE CHANGE IN THE NUMBER OF POINTS C WRITTEN FOR 50 POINTS - * KMUT FOR MULTIPLES OF 50 POINTS N(I)=DENSTY(POINT) * KMULT NOUT(I)=OUTLIR(POINT) * KMULT POINT=POINT+1 15 CONTINUE C C GENERATE WITHIN CLUSTER DATA POINTS C NUM=0 DO 16 J=1,TCLUSN ICNT=1 OLDNUM=NUM+1 NUM=N(J)+NUM NUMBR=N(J)*TDIMNR C CALL NORMAL(SEED,NUMBR,1) C JJ=J*2 JJ1=JJ-1 SAVENB(JJ1)=OLDNUM SAVENB(JJ)=NUM DO 39 K=OLDNUM,NUM LIST(K)=J 39 CONTINUE DO 17 I=1,TDIMNR DO 18 K=OLDNUM,NUM 20 COORD(K,I)=MEAN(J,I) + NORML(ICNT)*STDDEV(J,I) IF(COORD(K,I).LT.BOUNDS(1,J,I)) GO TO 19 IF(COORD(K,I).GT.BOUNDS(2,J,I)) GO TO 19 ICNT=ICNT+1 GO TO 18 C 19 CALL NORMAL(SEED,1,ICNT) C GO TO 20 18 CONTINUE 17 CONTINUE 16 CONTINUE C C GENERATE OUTLIERS C IGM=1 NUM=KPNTS 94 ICNT=1 DO 21 J=1,TCLUSN OLDNUM=NUM+1 NUM=NOUT(J)*IGM/3+NUM NUMBR=NOUT(J)*TDIMNR*IGM/3 C CALL NORMAL(SEED,NUMBR,1) C DO 22 K=OLDNUM,NUM 93 DO 23 I=1,TDIMNR ICNT=ICNT+1 COORD(K,I)=MEAN(J,I)+NORML(ICNT) * STDDEV(J,I)*3.0 23 CONTINUE DO 24 I=1,TCLUSN DO 25 L=1,TDIMNR IF((COORD(K,L).LT.BOUNDS(1,I,L)).OR. * (COORD(K,L).GT.BOUNDS(2,I,L))) GO TO 24 25 CONTINUE ICNT=ICNT-TDIMNR C CALL NORMAL(SEED,TDIMNR,ICNT) C GO TO 93 24 CONTINUE 22 CONTINUE 21 CONTINUE IF(IGM.EQ.2) GO TO 95 IGM=2 GO TO 94 C C WRITE TYPE 1 C 95 CONTINUE ICHK = KPRNT(1) IF(ICHK .EQ. 0) GO TO 801 NPOINT=KPNTS TYPE=1 WRITE(10,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 30 I=1,KPNTS WRITE(10,31) I, (COORD(I,K),K=1,TDIMNR) 30 CONTINUE 801 CONTINUE C C WRITE TYPE 2 C ICHK=KPRNT(2) IF(ICHK .EQ. 0) GO TO 802 NPOINT=KPNTS + (0.20 * KPNTS) TYPE=2 WRITE(20,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 96 I=1,NPOINT WRITE(20,31) I, (COORD(I,K),K=1,TDIMNR) 96 CONTINUE 802 CONTINUE C C WRITE TYPE 3 C ICHK = KPRNT(3) IF(ICHK .EQ. 0) GO TO 244 NPOINT=KPNTS + (0.40 * KPNTS) TYPE=3 WRITE(30,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 38 I=1,KPNTS WRITE(30,31) I, (COORD(I,K),K=1,TDIMNR) 38 CONTINUE KPNTS1 = KPNTS + 1 DO 138 I=KPNTS1,NPOINT WRITE(30,31) I, (COORD(I,K),K=1,TDIMNR) 138 CONTINUE C C COMPUTE COORDINATE MATRIX FOR LOW/HIGH ERROR C 244 CONTINUE ICHK = KPRNT(4) IF(ICHK .EQ. 0) GO TO 45 IREP=1 GAMMA=1.0 TYPE=4 NPOINT=KPNTS ICNT=0 IP = 4 40 CONTINUE NUMBR=1225*TDIMNR C CALL NORMAL(SEED,NUMBR,1) C DO 41 I=1,KPNTS DO 42 K=1,TDIMNR ICNT=ICNT+1 ERROR(I,K)=COORD(I,K)+GAMMA*NORML(ICNT)*STDDEV(LIST(I),K)/2. 42 CONTINUE 41 CONTINUE C C WRITE TYPE 4 AND TYPE 5 C IPXX=IP*10 WRITE(IPXX,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 44 I=1,KPNTS WRITE(IPXX,31) I, (ERROR(I,K),K=1,TDIMNR) 44 CONTINUE 45 CONTINUE C C LOOP BACK TO COMPUTE HIGH ERROR SET (TYPE 5) C IF(IREP.EQ.2) GO TO 47 ICHK = KPRNT(5) IF(ICHK .EQ. 0) GO TO 47 IREP=2 GAMMA=2. TYPE=5 ICNT=0 IP = 5 GO TO 40 C C GENERATE DATA FOR 1 DIMENSION RANDOM ERROR C 47 CONTINUE KVAL = KPNTS * 3 CALL RANDOM(SEED,KVAL) C DO 48 I=1,KPNTS COORD(I,TP1)=UNI(I)*SAVE COORD(I,TP2+1)=COORD(I,TP1) 48 CONTINUE C C WRITE TYPE 6 C ICHK=KPRNT(6) IF(ICHK .EQ. 0) GO TO 245 TYPE=6 WRITE(60,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 52 I=1,KPNTS WRITE(60,31) I, (COORD(I,K),K=1,TP1) 52 CONTINUE C C GENERATE DATA FOR 2 DIMENSION RANDOM ERROR C 245 CONTINUE KP1 = KPNTS + 1 K2 = KPNTS * 2 DO 55 I=KP1,K2 KM50 = I - KPNTS COORD(KM50,TP1)=UNI(I)*SAVE KP50 = I + KPNTS COORD(KM50,TP2)=UNI(KP50)*SAVE 55 CONTINUE C C WRITE TYPE 7 C ICHK = KPRNT(7) IF(ICHK .EQ. 0) GO TO 807 TYPE=7 WRITE(70,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) C C CHECK THIS STATEMENT C TP3=TP2+1 C DO 59 I=1,KPNTS WRITE(70,31) I, (COORD(I,K),K=1,TP3) 59 CONTINUE 807 CONTINUE C C WRITE TYPE 8 C ICHK = KPRNT(8) IF(ICHK .EQ. 0) GO TO 242 TYPE=8 WRITE(80,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 79 I=1,KPNTS WRITE(80,31) I, (COORD(I,K),K=1,TDIMNR) 79 CONTINUE C C STANDARDIZE THE COORDINATE VALUES C 242 CONTINUE IDIM=1 64 SUM1=0.0 SUM2=0.0 DO 62 I=1,KPNTS SUM1=SUM1+COORD(I,IDIM) SUM2=COORD(I,IDIM)*COORD(I,IDIM)+SUM2 62 CONTINUE CMEAN=SUM1/KPNTS SUM1=(SUM1*SUM1)/KPNTS STD=SQRT((SUM2-SUM1)/KPNTS) DO 63 I=1,KPNTS COORD(I,IDIM)=(COORD(I,IDIM)-CMEAN)/STD 63 CONTINUE IDIM=IDIM+1 IF(IDIM.LE.TDIMNR) GO TO 64 C C WRITE TYPE 9 C ICHK = KPRNT(9) IF(ICHK .EQ. 0) GO TO 809 TYPE=9 WRITE(90,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 68 I=1,KPNTS WRITE(90,31) I, (COORD(I,K),K=1,TDIMNR) 68 CONTINUE 809 CONTINUE C C WRITE TYPE 10 C ICHK = KPRNT(10) IF(ICHK .EQ. 0) GO TO 243 TYPE=10 WRITE(100,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 73 I=1,KPNTS WRITE(100,31) I, (COORD(I,K),K=1,TDIMNR) 73 CONTINUE C C COMPUTE RANDOM NOISE DATA C 243 CONTINUE NUMBR=KPNTS*TDIMNR C CALL RANDOM (SEED,NUMBR) C ICNT=0 DO 82 I=1,KPNTS DO 83 J=1,TDIMNR ICNT=ICNT+1 COORD(I,J)=NORML(ICNT)*SAVEBD 83 CONTINUE 82 CONTINUE C C WRITE TYPE 11 C ICHK = KPRNT(11) IF(ICHK .EQ. 0) GO TO 811 TYPE=11 WRITE(110,29) SET,TYPE,NPOINT,DIMNR, 1 TDIMNR,CLUSNR,TCLUSN,DENSIT, 2 (SAVENB(I),I=1,ITT) DO 87 I=1,KPNTS WRITE(110,31) I, (COORD(I,K),K=1,TDIMNR) 87 CONTINUE 811 CONTINUE C C END MASTER CONTROL DO-LOOPS C 4 CONTINUE 3 CONTINUE 2 CONTINUE 1 CONTINUE C C FORMAT STATEMENTS C 29 FORMAT(18I4) 31 FORMAT(I3,20F8.3) STOP END C C*********************************************************************** C C UNIFORM [0,1] AND NORMAL N(0,1) RANDOM NUMBER GENERATOR C PACKAGE. C C USAGE: C C IN ANY ROUTINE WHICH IS TO CALL THE GENERATOR, INSERT THE C FOLLOWING COMMON BLOCK: C C COMMON/RAND/RND(9800) FOR THE UNIFORM GENERATOR C COMMON/RAND2/RND2(9800) FOR THE NORMAL GENERATOR C C TO CALL FOR UNIFORM NUMBERS USE: C C CALL RANDOM (SEED,NTOTAL) C C TO CALL FOR NORMAL NUMBERS USE: C C CALL NORMAL (SEED,NTOTAL,LOC) C C WHERE SEED IS A DOUBLE PRECISION RANDOM SEED TO START THE C GENERATOR. SEED SHOULD BE SPECIFIED BEFORE THE FIRST CALL OF THE C ROUTINE AND SHOULD NOT BE CHANGED UNLESS THE USER WISHES TO RESTART C THE RANDOM NUMBER STREAM. NTOTAL IS THE TOTAL NUMBER OF OBSERVATIONS C TO BE GENERATED. AS THE PROGRAM NOW STANDS, UP TO 9800 NUMBERS C CAN BE GENERATED IN ONE CALL, ALTHOUGH THIS IS EASY TO CHANGE. C LOC INDICATES THE STORAGE LOCATION IN RND2 WHERE THE FIRST C GENERATED NUMBER IS TO APPEAR. C C C*********************************************************************** C SUBROUTINE RANDOM(XN,NTOT) C C*********************************************************************** C C RANDOM -- DECENT RND, FROM KNUTH, VOLUME 2, PAGE 88, 11/21/81, DR. N. B. C WAITE. C C RND IS A SIMPLE AND RELATIVELY FAST RANDOM NUMBER ROUTINE. THE C ALGORITHM USED IS ONE FROM KNUTH, "THE ART OF COMPUTING", VOLUME 2, PAGE C 88. AS KNUTH SHOWS, THE RANDOM NUMBERS GENERATED ARE OF GOOD QUALITY. C C THE ALGORITHM IS AS FOLLOWS: TAKE A LARGE INTEGER XN. REPLACE XN C BY XN TIMES 5**15 PLUS 1 MODULO 2**25, I.E., C C XN <-- (XN * 5**15 + 1) MOD 2**35 C C THE FLOATING POINT RANDOM NUMBER RETURNED BY THE FUNCTION IS, THEN, C RND = XN / 2**35. C C THE NUMBERS RETURNED ON CALLS WHERE XN IS NOT CHANGED APPEAR TO BE C INDEPENDENT AND UNIFORMLY DISTRIBUTED ON THE INTERVAL [0,1]. C C XN IS THE RANDOM NUMBER SEED. IT SHOULD BE SPECIFIED AT THE C CALL OF THE ROUTINE AND NOT CHANGED IF THE SAME STREAM OF NUMBERS IS C DESIRED. C C RAND IS AN ARRAY OF UP TO 9800 RANDOM NUMBERS AS GENERATED BY THE C PROGRAM. C C NTOT IS THE TOTAL NUMBER OF NUMBERS TO BE GENERATED. C C MOST OF THE ARITHMETIC REQUIRED IS ARITHMETIC ON LARGE WHOLE C NUMBERS. THIS ARITHMETIC CAN BE PERFORMED IN DOUBLE PRECISION FLOATING C POINT AS LONG AS THE WHOLE NUMBERS ARE SUFFICIENTLY SMALL. USING THE C DMOD FUNCTION THREE TIMES KEEPS THE NUMBERS SUFFICIENTLY SMALL. C C WRITTEN IN FORTRAN 66 BY DR. N. B. WAITE ON NOVEMBER 21, 1981. C MODIFIED BY GLENN W. MILLIGAN ON FEBRUARY 15, 1982. C C*********************************************************************** C DOUBLE PRECISION XN, C C COMMON/RAND/ RND(9800) C DATA C / 34359738368.0D0 / C DO 200 I=1,NTOT XN=DMOD(DMOD(DMOD(XN*3125,C)*3125,C)*3125+1,C) RND(I)=XN/C 200 CONTINUE RETURN END C C*********************************************************************** C SUBROUTINE NORMAL(SEED,NTOT,NST) C C*********************************************************************** C C THIS ROUTINE GENERATES NORMAL RANDOM NUMBERS N(0,1). THE METHOD C USED IS BY BOX, G. E. P. AND MULLER, M. E. "A NOTE ON THE C GENERATION OF RANDOM NORMAL DEVIATES", ANNALS OF MATHEMATICAL C STATISTICS, VOL. 29, 610-611. ALTHOUGH IT IS KNOWN THAT THIS C PROCEDURE DOES NOT GENERATE ENOUGH VALUES IN THE TAILS, C THIS WAS NOT CONSIDERED IMPORTANT SINCE THESE ARE TRUNCATED C IN ANY EVENT. C C THE ROUTINE CALLS A UNIFORM [0,1] GENERATOR. IN THE PRESENT CASE, C THIS IS THE DECENT RND ROUTINE BY KNUTH. C C SEED IS A REAL*8 (DOUBLE PRECISION) NUMBER. NTOT IS THE TOTAL C NUMBER OF NORMAL DEVIATES TO BE GENERATED. C C*********************************************************************** C COMMON/RAND/ RND(9800) COMMON/RAND2/RND2(9800) C DOUBLE PRECISION SEED C NNN=NTOT IF(NTOT.EQ.1) NNN=2 CALL RANDOM(SEED,NNN) NNTOT=NST+NTOT-1 DO 1 K=NST,NNTOT,2 RND2(K )=SQRT(-2.*ALOG(RND(K-NST+1)))* 1 COS(6.2831853*RND(K-NST+2)) RND2(K+1)=SQRT(-2.*ALOG(RND(K-NST+1)))* 2 SIN(6.2831853*RND(K-NST+2)) 1 CONTINUE RETURN END