RSBOOT
A Program to Calculate the A Shape Coefficient for Rank-Size Plots
with Error Ranges for Specified Confidence Levels

Robert D. Drennan

Revised, March 2, 2019

RSBOOT is a utility program to calculate the A coefficient of shape for rank-size plots with error ranges for specified confidence levels, as proposed by Drennan and Peterson (2004, "Comparing Archaeological Settlement Systems with Rank-Size Graphs: A Measure of Shape and Statistical Confidence" Journal of Archaeological Science 31:533-549). The A coefficient is based on areas in a rank-size plot, normalized so that the area of the square defined by the log-normal line as its diagonal is equal to 2.0. RSBOOT approximates the areas involved by calculating the areas of a series of rectangles defined by the data points along the observed and log-normal lines. Input to the program is an ASCII text file of settlement sizes for any number of settlements up to 1000. (For datasets consisting of more than 1000 settlements, Drennan and Peterson suggest calculating the statistic on the basis of the largest 1000 settlements.) Output includes the A coefficient and error ranges for 99%, 95%, 90%, 80%, and 66% confidence levels. Optionally, the user can request an error range for any desired confidence level and a data file for plotting a rank-size graph with a confidence zone for a specified confidence level. RSBOOT is written in FORTRAN and runs in the command-line (DOS) window of Windows operating systems. It is known to work properly in Windows XP, Windows 7, and Windows 10 (32- or 64-bit). It may be necessary to be quite insistent to convince Windows and anti-virus software that you really do want to run this program from an unknown source.
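The rectangle approximation described above can be illustrated with a short Python sketch. This is not RSBOOT's FORTRAN code, and the details are assumptions: the function name shape_coefficient is hypothetical, the log-normal line is taken to run from the largest settlement at rank 1 with expected size equal to the largest size divided by rank, and each rectangle spans the interval from one rank to the next with its height taken at the lower rank. The published definition in Drennan and Peterson (2004) should be consulted for the exact procedure.

```python
import math


def shape_coefficient(sizes):
    """Rectangle approximation of the signed area between the observed
    rank-size curve and the log-normal line, in log-log space.

    Assumption (not taken from RSBOOT itself): rectangles run from each
    rank to the next, with height measured at the lower rank.
    """
    ranked = sorted(sizes, reverse=True)
    n = len(ranked)
    if n < 2 or ranked[-1] <= 0:
        raise ValueError("need at least two positive settlement sizes")

    largest = ranked[0]
    area = 0.0
    for rank in range(1, n):
        # Horizontal extent of this rectangle on the log rank axis.
        width = math.log10(rank + 1) - math.log10(rank)
        # Rank-size rule: expected size is the largest size divided by rank.
        expected = largest / rank
        # Positive where the observed curve lies above the log-normal line.
        height = math.log10(ranked[rank - 1]) - math.log10(expected)
        area += width * height

    # The square with the log-normal line as its diagonal has side log10(n);
    # rescaling its area to 2.0 divides the raw area by half that square.
    side = math.log10(n)
    return area / (side * side / 2.0)
```

With this sketch, a set of sizes that follows the rank-size rule exactly gives a value of 0, a set of equal sizes gives a positive value, and a set dominated by one very large settlement gives a negative value.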

To use RSBOOT, first download the program and save the file RSBOOT.EXE in the same folder where the data files to be analyzed are. No further installation is necessary. Just double-click RSBOOT.EXE to run the program. The command-line (DOS) window will open, and you will be asked a series of questions:

1. "Enter name of file to read settlement sizes from:" Type in the name of the file from which the settlement data are to be read. File names must consist of no more than eight characters and may not contain spaces or most punctuation. If the file has an extension, the extension must contain no more than three characters, and the extension must be entered, separated from the file name by a period (no spaces). This must be an ASCII text file with one settlement size on each line. Such a file can be created with a word processor or text editor (Word, WordPerfect, WordPad, Notepad, etc.), with a spreadsheet (Excel, etc.), or with a database manager (Access, etc.). Simply specify ASCII Text as the File Type when saving the file for RSBOOT. (See below for a note on working with ASCII files.) After typing the file name and extension, type <Enter>.

2. "Enter name of file to write results to:" Type in the name of the file in which the A coefficient and its error ranges are to be saved. (This output will also appear on the screen.) The same constraints on file names apply as in 1. above.

3. "Enter a positive 7-digit integer to seed the random number generator:" Enter any 7-digit whole number greater than 0.

RSBOOT will then calculate the A coefficient and error ranges for 99%, 95%, 90%, 80%, and 66% confidence levels and write these results on the screen and to the specified file for saving results. The number of settlements observed is given so that complete reading of the input data file can be verified.

4. "To produce an error range for some other specific confidence level, or to produce a data file for plotting confidence zones, type Y now (or just type <Enter> to close):" If no further results are desired, simply type <Enter>, and the window will close. Type Y<Enter> if you want either an error range for a confidence level other than those already provided or a data file for plotting a rank-size graph with a confidence zone for a specified confidence level.

5. "Enter name of file to write plotting data to:" Enter the name of the file (as before) where the plotting data are to be saved.

6. "Enter desired confidence level % (as an integer between 1 and 99):" For the 90% confidence level, for example, type 90<Enter>.

RSBOOT will calculate the error range for the specified confidence level and write this result on the screen (it is not saved in a file). It will then save the values for five variables in the specified file. The file will begin with this text on the first line: "RANK, SETSIZE, LOGNORM, SIZEUP, SIZELO". Subsequent lines will contain values for these five variables, separated by commas. In the first column is the settlement rank, beginning with 1 for the largest settlement. In the second column is the size of the observed settlement. In the third column is the expected size of the settlement, according to the rank-size rule (this variable comprises points along the log-normal line). In the fourth column is a set of settlement sizes that produces a rank-size curve with an A value approximately that of the upper cut point of the error range for the specified confidence level. In the fifth column is a set of settlement sizes that produces a rank-size curve with an A value approximately that of the lower cut point of the error range for the specified confidence level. These last two variables are selected from the 1000 resamples created by RSBOOT. The A values that correspond to them may differ from the cut points of the error range because RSBOOT will only select a resample that includes the largest settlement, so that the rank-size curve can be plotted to complement the observed shape. This data file can be easily read by most general-purpose statistical software to produce a rank-size plot with a confidence zone. The file is a comma-delimited ASCII text file; many programs will expect it to have the extension .TXT. Usually the names on the first line of the file will be taken as variable names; if this interferes with reading by some program, the file may be opened with a word-processing or text-editing program (Word, WordPerfect, WordPad, Notepad, etc.) and the first line deleted.
With SYSTAT, for example, it is necessary only to select File>Open>Data and choose ASCII Text (*.txt) in the Files of Type drop-down list. The file will open with five variables, named RANK, SETSIZE, LOGNORM, SIZEUP, and SIZELO. The following command produces a rank-size plot with a confidence zone: PLOT SETSIZE LOGNORM SIZEUP SIZELO*RANK / OVERLAY LINE XLOG=10 YLOG=10 STICK=OUT COLOR=1,2,10 SIZE=0.000,0.000,0.000. It could simply be copied from here and pasted into SYSTAT's Interactive window.
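For users without SYSTAT, the plotting file can also be read with a few lines of Python. The following is a minimal sketch that assumes the file layout described above (a header line of five names followed by comma-separated numeric values); the function name read_plot_file is hypothetical, and only the Python standard library is used.

```python
import csv


def read_plot_file(path):
    """Read a comma-delimited plotting file into a dict that maps each
    column name from the header line to a list of floats."""
    with open(path, newline="") as handle:
        rows = csv.reader(handle, skipinitialspace=True)
        # Header names may carry padding spaces, e.g. "RANK, SETSIZE, ...".
        names = [name.strip() for name in next(rows)]
        columns = {name: [] for name in names}
        for row in rows:
            # Skip blank lines, such as a trailing empty line.
            if not any(cell.strip() for cell in row):
                continue
            for name, value in zip(names, row):
                columns[name].append(float(value))
    return columns
```

If the plotting data were saved as PLOT.TXT (a made-up name), read_plot_file("PLOT.TXT")["SETSIZE"] would return the observed sizes in rank order. The four size columns can then be plotted against RANK, with both axes on a logarithmic scale, in any plotting package.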

As with all resampling applications, results of multiple runs of RSBOOT with different random number seeds will differ slightly, especially when the sample is small in the first place. If the sample contains fewer than 25 or 30 settlements, it is a good idea to run RSBOOT several times with different random number seeds, just to avoid relying on a first result that turns out to be quite unusual. After several runs, it should be clear what A values are usual for the cut points of error ranges for different confidence levels and what curve shapes are usual for confidence zones in the plots. With this knowledge, unusual results can be set aside and a typical run chosen as the one to report.
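The effect of the random number seed can be illustrated with a small Python sketch. It is not RSBOOT's own procedure: it assumes a simple percentile bootstrap over 1000 resamples drawn with replacement, and it uses a compact stand-in for the A coefficient (the hypothetical function shape_sketch, which redraws the log-normal line from the largest settlement in each resample). The function names and the settlement sizes are made up for illustration.

```python
import math
import random


def shape_sketch(sizes):
    """Compact stand-in for the A coefficient (an assumption, not RSBOOT's
    code): signed rectangle area between observed and log-normal lines."""
    s = sorted(sizes, reverse=True)
    n = len(s)
    area = sum(
        (math.log10(r + 1) - math.log10(r))
        * (math.log10(s[r - 1]) - math.log10(s[0] / r))
        for r in range(1, n)
    )
    return area / (math.log10(n) ** 2 / 2.0)


def error_range(sizes, level, seed, n_resamples=1000):
    """Percentile bootstrap range for an integer confidence level (1 to 99)."""
    rng = random.Random(seed)
    stats = sorted(
        shape_sketch(rng.choices(sizes, k=len(sizes)))
        for _ in range(n_resamples)
    )
    # Number of resamples cut off in each tail, using integer arithmetic.
    tail = (100 - level) * n_resamples // 200
    return stats[tail], stats[n_resamples - 1 - tail]


# Hypothetical settlement sizes; a small sample makes seed effects visible.
example_sizes = [120, 45, 30, 22, 18, 15, 9, 7, 5, 4, 3, 2]
for seed in (1234567, 7654321, 2468135):
    low, high = error_range(example_sizes, 90, seed)
    print(seed, round(low, 3), round(high, 3))
```

Repeating a run with the same seed reproduces the same range exactly, while different seeds give slightly different cut points, which is the behavior to look for when comparing several RSBOOT runs.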

Working with ASCII (text) files has its quirks. Programs like Access and Excel are pretty good at exporting them. The "comma-delimited ASCII file" (with the extension .CSV) is usually a good choice. If you need to edit your input file before running RSBOOT, there is something to be said for using a very simple file editor, like WordPad or Notepad, because they do fewer strange things to files without telling you than more powerful word processors do. On the other hand, a more powerful word processor like Word makes it easy to see non-printing characters that may cause problems when RSBOOT tries to read your input file. (With Word, select Tools, then Options, then the View tab, and check the box by All in the Non-Printing Characters section of the dialog box.) The input file should contain nothing except a single number on each line; each line should end with a carriage-return (which will show up as the non-printing paragraph symbol in Word); and the last line in the file should have a single carriage-return character. It is important to be sure that whatever program you use does save the file as an ASCII or "text" file. If you use Word, you must be sure not only to save the data file, but also to close Word before running RSBOOT, or Word will leave the file marked as open and not allow RSBOOT to read it. Common extensions for text files are .TXT, .DAT, or .CSV. RSBOOT will work with any extension, but you must tell RSBOOT what the extension is. (It is exceedingly practical here to open Windows Explorer and select Tools, then Options, and, under the View tab, remove the check from the box by Hide file extensions for known file types. This way you can see in Windows Explorer what the extension to a filename actually is.)
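An input file that meets these requirements can also be written by a short script, which avoids the word-processor pitfalls altogether. The Python sketch below is one way to do it under stated assumptions: the function name write_input_file is hypothetical, the file-name check allows only letters, digits, and underscores (a deliberately conservative reading of "no spaces or most punctuation"), and lines are ended with the Windows carriage-return/line-feed pair. It also keeps only the largest 1000 sizes, following the suggestion for larger datasets.

```python
import os
import re

# Conservative assumption: up to 8 name characters, optional extension of
# up to 3 characters, letters/digits/underscore only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def write_input_file(name, sizes, folder=".", limit=1000):
    """Write one settlement size per line as plain ASCII text.

    Returns the number of sizes written, which can be compared with the
    number of settlements RSBOOT reports having read.
    """
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("file name does not fit the 8.3 pattern: " + name)

    # Keep only the largest `limit` settlements, largest first.
    kept = sorted(sizes, reverse=True)[:limit]

    path = os.path.join(folder, name)
    # newline="\r\n" makes every "\n" written end up as CR+LF on disk.
    with open(path, "w", encoding="ascii", newline="\r\n") as handle:
        for size in kept:
            handle.write(f"{size}\n")
    return len(kept)
```

For example, write_input_file("SITES.TXT", [12.5, 3.0, 40.2]) would create SITES.TXT (a made-up name) in the current folder with three lines, the last of which also ends with a line terminator.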