MINITAB BASICS FOR THE MAC

Dr. Nancy Pfenning
May 2014

After starting MINITAB, you'll see a Session window above and a worksheet below. The Session window displays both graphs and non-graphical output such as tables of statistics and character graphs. A worksheet is where we enter, name, view, and edit data. The Navigator bar at the left enables you to access any of the summaries or graphs produced during your session. You can highlight a specific item and right-click to Delete or Export as PDF or HTML.

The menu bar across the top contains the main menus: File, Edit, Data, Graph, Statistics, View, Window, and Help. Beneath each item in the menu bar is a drop-down list of important actions.

In the instructions that follow, text to be typed will be underlined. Menu instructions will be set in boldface type with the entries separated by pointers. Variable names begin with a capital letter.

STORING DATA

Each data set is stored in a column, designated by a "C" followed by a number. For example, C1 stands for Column 1. The column designations are displayed along the top of the worksheet. The numbers at the left of the worksheet represent positions within a column and are referred to as rows. Each rectangle occurring at the intersection of a column and a row is called a cell. It can hold one observation.

The active cell has the worksheet cursor inside it and a blue rectangle around it. To enter or change an observation in a cell, we first make the cell active and then type the value.

Directly below each column label in the worksheet is a cell optionally used for naming the column. To name the column, we click on this cell and type the desired name.

Whenever a variable name is to be entered in a text box, instead of typing it directly, you may double-click on its name in the box on the left.

Example A: Suppose we want to store heights, in inches, of female class members [64, 65, 61, 70, 65, 66, ...] into column C1 and name the column "FHts". Just click in the name cell for this column, type FHts, and press the "Enter" key. Then type 64, Enter, 65, Enter, 61, Enter, and so on. Note that a height of "5 foot 7" would be entered as 67, and "6 foot 1" would be 73.

Example B: To store male heights, name column C2 "MHts" and enter those data values in this column.

DESCRIPTIVE STATISTICS AND GRAPHS

Note: It is possible to opt out of unwanted summaries by choosing Statistics from the middle of the upper box under Descriptive Statistics, and unchecking them.

Example C: For sample size N, number of non-responses N*, Mean, SE Mean, StDev, Minimum, Q1, Median, Q3, and Maximum of female height data,

  1. Choose Statistics>Summary Statistics>Descriptive Statistics...
  2. Specify FHts in the Variable text box (instead of typing it directly, you may double-click on FHts in the box on the left).
  3. Click OK.
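For readers who want to check the output by hand, the summaries in Example C can be sketched in Python. This is only an illustration: it uses the six heights listed in Example A (the rest of the class list is not shown there), and it assumes quartiles are located at positions (n + 1)/4 and 3(n + 1)/4 with linear interpolation, which may or may not match the software's rule exactly.

```python
import math
import statistics

# First six female heights listed in Example A (the full class list is elided there).
fhts = [64, 65, 61, 70, 65, 66]

n = len(fhts)
mean = statistics.mean(fhts)
stdev = statistics.stdev(fhts)      # sample standard deviation (divides by n - 1)
se_mean = stdev / math.sqrt(n)      # standard error of the mean

# Assumption: quartiles at positions (n + 1)/4 and 3(n + 1)/4 with linear
# interpolation, which is what method="exclusive" computes.
q1, median, q3 = statistics.quantiles(fhts, n=4, method="exclusive")

print(f"N={n} Mean={mean:.3f} SE Mean={se_mean:.3f} StDev={stdev:.3f}")
print(f"Min={min(fhts)} Q1={q1} Median={median} Q3={q3} Max={max(fhts)}")
```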

For a histogram (D), stemplot (E), and boxplot (F) of female height data,

Example D:

  1. Choose Graph>Histogram...
  2. Click on the Simple histogram (on the left).
  3. Specify FHts in the Variables text box.
  4. Click OK.

Example E:

  1. Choose Graph>Stem-and-Leaf Plot...
  2. Specify FHts in the Variables text box.
  3. Click OK.

Example F:

  1. Choose Graph>Boxplot...
  2. Click on the Simple boxplot, under Single Y Variable (upper left).
  3. Specify FHts in the Variable text box.
  4. Click OK.

To produce side-by-side boxplots of male and female heights,

  1. Choose Graph>Boxplot...
  2. Click on the Simple boxplot under Multiple Y Variables (lower left).
  3. Specify FHts and MHts in the Variables text box.
  4. Click OK.

Example G: To combine and sort female and male class members' heights,

  1. Choose Data>Stack Columns...
  2. Specify FHts and MHts with a space between them as columns to be stacked.
  3. Click OK, resulting in a new column called Stack.
  4. Choose Data>Sort.
  5. Specify Stack in the Columns to sort: text box, and also Stack in the Columns to sort by: box. Do not select Store the sorted data in the original columns.
  6. Click OK, resulting in a new column called Sorted Stack; consider renaming this Sorted_Hts.

In Example G, you can also opt to "Store the sorted data in the original columns." Unlike its predecessors, Minitab for the Mac doesn't give the user additional options for where the stacked data should be stored, such as into a new worksheet, or into a new column specified with a new column name. To store in a new worksheet, simply cut the column, open a new worksheet, and paste it in. The column can be given a more meaningful name by accessing and changing it directly in the worksheet.
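As a rough picture of what stacking and sorting do to the data in Example G, here is a small Python sketch. The female heights are the six values from Example A; the male heights are made up for illustration, since the handout does not list them.

```python
fhts = [64, 65, 61, 70, 65, 66]   # first six values from Example A
mhts = [70, 68, 72, 69, 74, 71]   # hypothetical male heights, for illustration only

stack = fhts + mhts               # stacking: one column placed under the other
sorted_hts = sorted(stack)        # sorting into a new column; stack itself is unchanged

print(sorted_hts)
```

Because sorted() returns a new list, this corresponds to leaving "Store the sorted data in the original columns" unselected.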

The remaining examples work with existing data that are to be downloaded into MINITAB. Data for dozens of variables about hundreds of students can be accessed on Dr. Pfenning's website http://www.pitt.edu/~nancyp/stat-0200/index.html where the file name is highlighted. To download into MINITAB, type ctrl A to highlight and ctrl C to copy. Start up MINITAB [or, if it's already running, choose File>New to open a new worksheet], then type ctrl V to paste the data. Important: When you paste the data, have the cursor on the blank shaded cell under C1 but above Row 1. This puts the column names where they belong, so they will not be treated as data values.

Example H: Suppose all heights are entered in a single column Height, and genders (male or female) are entered in the column Gender. To compare heights of students in the two gender groups,

  1. Choose Statistics>Summary Statistics>Descriptive Statistics...
  2. Specify Height in the Variable text box.
  3. Specify Gender in the Group variable text box.
  4. Choose Display and check Boxplot.
  5. Click OK.

Now suppose all earnings are entered in a single column Earned, and Year contains values 1, 2, 3, 4, and Other. To compare earnings of students in Years 1 to 4 only (if for some reason the Others are to be omitted),

  1. Choose Data>Unstack Columns.
  2. Specify Earned for Unstack the data in and Year for Unstack using the values in:
  3. Click OK.
  4. Obtain desired descriptive statistics and displays for Earned_1 to Earned_4. [Boxplots would be Simple under Multiple Y Variables as in the second part of Example F.]

Observation: in Example H, the new columns are automatically called Earned_1 through Earned_Other and are stored in additional columns at the end of the same worksheet. Unlike its predecessors, Minitab for the Mac doesn't give users the option of storing the data in a new worksheet.
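The unstacking step can be pictured as grouping the Earned values by the matching Year value. The Python sketch below uses a handful of invented (Earned, Year) rows purely for illustration; the column names Earned_1 through Earned_Other follow the naming described in the observation above.

```python
from collections import defaultdict

# Hypothetical (Earned, Year) rows; the real values come from the class data file.
rows = [(3, "1"), (5, "2"), (2, "1"), (10, "4"), (7, "3"), (4, "Other"), (6, "2")]

# Unstack: one list of earnings per distinct Year value.
unstacked = defaultdict(list)
for earned, year in rows:
    unstacked[f"Earned_{year}"].append(earned)

# Keep Years 1 to 4 only, omitting the "Other" students.
years_1_to_4 = {name: vals for name, vals in unstacked.items() if name != "Earned_Other"}

for name in sorted(years_1_to_4):
    print(name, years_1_to_4[name])
```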

Example I: Suppose all heights were entered in a single column Height, and genders (M or F) were entered in the column Gender. To produce side-by-side boxplots of male and female heights,

  1. Choose Graph>Boxplot...
  2. Click on With Groups, under Single Y Variable (upper right).
  3. Specify Height in the Variable text box and Gender in the Group variable text box.
  4. Click OK.

RANDOM SAMPLING

Example J: We can use MINITAB to take a random sample of, say, 10 heights from those in a data column.

  1. Choose Data>Sample from Columns.
  2. Specify Height in the Take a sample from the following columns: box and type 10 for the Number of rows of each sample box.
  3. For most purposes, keep the default Sample without replacement option.
  4. Click OK. (Minitab stores the data in a new column called Sample From Height.)

Note: For independent samples (such as for two-sample t or ANOVA), perform the above steps twice. To sample pairs of values (such as for paired t or regression), two columns of equal length can be specified (e.g., MomAge and DadAge).

Example K: We can also use MINITAB to randomly select 5 from 100 names in a hard-copy list. Assume the names are listed alphabetically, where the first name corresponds to the number 1 and the last corresponds to the number 100.

  1. Choose Data>Generate Patterned Data>Numeric...
  2. Keep the default Equally spaced numbers.
  3. For the First number type 1.
  4. For the Second number type 100, keeping the default of 1 as the Size of each step.
  5. Click OK and the numbers will be stored in a new column called Numeric Pattern.
  6. Choose Data>Sample From Columns...
  7. Specify Numeric Pattern in the Take a sample from the following columns: box.
  8. Type 5 in the small text box after Number of rows in each sample:
  9. Leave default Method: as Sample without replacement.
  10. Click OK and the random sample of 5 numbers is stored in a new column called Sample From Numeric Pattern.
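The same idea as Example K, selecting 5 of the numbers 1 to 100 without replacement, can be sketched in Python with the standard random module. The variable names are illustrative only, and the result will differ from run to run.

```python
import random

numbers = list(range(1, 101))        # the patterned column: 1, 2, ..., 100
sample = random.sample(numbers, 5)   # 5 distinct values, i.e. sampling without replacement

print(sorted(sample))
```

The names in positions given by the sampled numbers are the ones selected from the hard-copy list.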

STATISTICAL INFERENCE; CONFIDENCE INTERVALS

Note: Confidence intervals are automatically provided in the output for a hypothesis test, but it will not be the standard confidence interval unless the two-sided alternative has been selected.

I see you took my suggestion to replace One-Sample Hypothesis Tests with 1-Sample Inference but maybe my other suggestion 1-Sample Tests or Intervals is better? Same goes for 2-Sample Tests or Intervals.

Minor observation: why is it capital Z-test and lower-case t-test?

Example L: Assume Verbal SAT scores of surveyed students to be a random sample taken from scores of all Pitt students, whose mean score is unknown [actually, it is about 625] and standard deviation is assumed to be 100. Use sample scores to obtain a 90% confidence interval for population mean score.

  1. Choose Statistics>1-Sample Inference>Z...
  2. Keep the default Sample data in a column and specify VerbalSAT in the Sample box.
  3. Type 100 in the Known standard deviation text box.
  4. Select the Options button from the top to change from the default 95% level.
  5. In the Confidence level box type 90.
  6. Make sure Alternative hypothesis is at the default Mean not equal hypothesized value.
  7. Click OK.
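To see what the z interval in Example L computes, here is a minimal Python sketch of the formula: sample mean plus or minus z* times sigma over the square root of n. The sample mean and sample size below are invented for illustration; only the standard deviation of 100 and the 90% level come from the example.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Two-sided confidence interval for a mean when sigma is treated as known."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.645 for 90%
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical summary: 25 scores averaging 600; sigma = 100 as in Example L.
low, high = z_interval(xbar=600, sigma=100, n=25, level=0.90)
print(f"90% CI: ({low:.1f}, {high:.1f})")
```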

Example M: Assume Verbal SAT scores of surveyed students to be a random sample taken from scores of all Pitt students, whose mean and standard deviation are unknown. Use sample scores to obtain a 99% confidence interval for population mean score.

  1. Choose Statistics>1-Sample Inference>t...
  2. Specify VerbalSAT in the Samples box.
  3. Select the Options button from the top.
  4. Click in the Confidence level text box and type 99.
  5. Make sure Alternative is at the default Mean not equal hypothesized value.
  6. Click OK.

STATISTICAL INFERENCE; HYPOTHESIS TESTS

Example N: Test the null hypothesis that Verbal SAT scores of surveyed students are a random sample taken from a population with mean 600 against the alternative that the mean is greater than 600. Assume population standard deviation to be 100. [If population standard deviation were not assumed to be known, a 1-Sample t test would be used, and Standard deviation would not be specified.]

  1. Choose Statistics>1-Sample Inference>Z...
  2. Specify VerbalSAT in the Sample box.
  3. Click in the Known standard deviation: text box and type 100.
  4. Check the Perform hypothesis test box and enter 600 in the Hypothesized mean box.
  5. Select the Options button from the top.
  6. Under Alternative select Mean greater than hypothesized value.
  7. Click OK.
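The one-sided z test in Example N can be checked by hand with the sketch below. The sample mean and sample size are hypothetical; the hypothesized mean of 600 and the standard deviation of 100 are taken from the example.

```python
import math
from statistics import NormalDist

def z_test_greater(xbar, mu0, sigma, n):
    """z statistic and P-value for the alternative 'mean greater than mu0'."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p_value = 1 - NormalDist().cdf(z)    # area to the right of z
    return z, p_value

# Hypothetical summary: 100 scores averaging 620.
z, p = z_test_greater(xbar=620, mu0=600, sigma=100, n=100)
print(f"z = {z:.2f}, P-value = {p:.4f}")
```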

Observation: Unlike its predecessors, for paired and two-sample tests, Minitab for the Mac no longer provides the option of comparing the mean of differences, or the difference between means, to any number other than zero. This might be OK for most of us, although it is conceivable that we might want to make other comparisons, such as whether dads average more than 2 years older than moms, or whether male students' mean weight is over 25 pounds more than female students'.

Example O: Do students' dads tend to be older than their moms? Test the null hypothesis that the mean of differences (ages of dads minus ages of moms) for the larger population is zero vs. the alternative that the mean of differences is positive.

  1. Choose Statistics>2-Sample Inference>Paired t...
  2. Keeping the default Each sample is in a column, specify DadAge in the Sample 1 text box.
  3. Specify MomAge in the Sample 2 text box.
  4. Click the Options button at the top.
  5. Click the arrow button at the right of the Alternative hypothesis drop-down list box and select Mean difference greater than 0.
  6. Click OK.
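The paired t statistic is simple enough to compute by hand, which also makes it possible to test against a hypothesized mean difference other than zero. The sketch below is one way to do so in Python; the ages are invented, and the P-value is left to a t table with n - 1 degrees of freedom because the Python standard library has no t distribution.

```python
import math
import statistics

def paired_t(sample1, sample2, hypothesized_diff=0.0):
    """t statistic and degrees of freedom for paired data (sample1 minus sample2)."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    t = (statistics.mean(diffs) - hypothesized_diff) / se
    return t, n - 1

# Hypothetical ages, for illustration only.
dad_age = [50, 52, 47, 60]
mom_age = [48, 50, 46, 55]

t_zero, df = paired_t(dad_age, mom_age)                         # H0: mean difference = 0
t_two, _ = paired_t(dad_age, mom_age, hypothesized_diff=2.0)    # H0: mean difference = 2
print(f"t = {t_zero:.3f} (vs. 0), t = {t_two:.3f} (vs. 2), df = {df}")
```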

Example P: Use MINITAB to verify that female heights are significantly less than male heights. Procedure may or may not be pooled.

  1. Choose Statistics>2-Sample Inference>t...
  2. Keep the default Both samples are in one column and enter Height for Samples and Gender for Sample IDs.
  3. Click the Options button at the top.
  4. Click the arrow button at the right of the Alternative hypothesis drop-down list box and select Difference less than 0. (MINITAB considers the difference Females minus Males, with Females first because F comes before M in the alphabet).
  5. If sample standard deviations are close and you have reason to assume equal population variances, you may select the Assume equal variances check box, which carries out a pooled procedure. Otherwise, unselect it.
  6. Click on Display and select Boxplot.
  7. Click OK.

Alternatively, the data may occur in two columns of height values, one for each sex.

  1. Select the Each sample is in its own column option button if that is the case.
  2. In the Sample 1 text box, specify FHts.
  3. In the Sample 2 text box, specify MHts.
  4. Proceed as above.
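The difference between the pooled and unpooled procedures in Example P can be seen in the standard error and degrees of freedom each one uses. The Python sketch below computes both; the female heights are the six values from Example A and the male heights are hypothetical. The unpooled degrees of freedom use the usual Welch approximation, and software may round that value differently.

```python
import math
import statistics

def two_sample_t(sample1, sample2, pooled=False):
    """t statistic and degrees of freedom for mean(sample1) minus mean(sample2)."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    if pooled:
        # Pooled variance: assumes equal population variances.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Unpooled standard error with Welch's approximate degrees of freedom.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df

fhts = [64, 65, 61, 70, 65, 66]   # first six values from Example A
mhts = [70, 68, 72, 69, 74, 71]   # hypothetical male heights

t_unpooled, df_unpooled = two_sample_t(fhts, mhts)
t_pooled, df_pooled = two_sample_t(fhts, mhts, pooled=True)
print(f"unpooled: t = {t_unpooled:.3f}, df = {df_unpooled:.2f}")
print(f"pooled:   t = {t_pooled:.3f}, df = {df_pooled}")
```

A negative t statistic here is consistent with the alternative "Difference less than 0", since the difference is taken as females minus males.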

REGRESSION

Example Q: Use MINITAB to examine the relationship between ages of students' fathers and ages of their mothers; after verifying the linearity of the scatterplot, find the correlation r and the regression equation; produce a fitted line plot. Produce a plot of residuals vs. the explanatory variable (MomAge). Produce a scatterplot showing bands for confidence intervals and prediction intervals. Obtain a confidence interval for the mean age of all fathers when mothers are 40, and a prediction interval for the age of an individual father when the mother is 40 years old.

  1. Choose Graph>Scatterplot... and click on Simple.
  2. Specify DadAge in the Y variable text box.
  3. Specify MomAge in the X variable text box.
  4. Click OK.
  5. Choose Statistics>Regression>Correlation...
  6. Specify MomAge and DadAge in the Variables text box.
  7. Click OK.
  8. Choose Graph>Scatterplot and click on With regression.
  9. Specify DadAge in the Y variable text box.
  10. Specify MomAge in the X variable text box.
  11. Click OK.
  12. Choose Statistics>Regression>Simple Regression.
  13. Specify DadAge in the Response (Y): text box.
  14. Specify MomAge in the Predictor (X): text box.
  15. Click on the Graphs box at the top.
  16. Check the Residuals versus variables: box and specify MomAge.
  17. Click OK.
  18. Choose Statistics>Regression>Simple Regression...
  19. Specify DadAge in the Response (Y): box.
  20. Specify MomAge in the Predictor (X): box.
  21. Choose the Options button and click in both Display 95% confidence interval and Display 95% prediction interval.
  22. Click OK.
  23. Choose Stat>Regression>Regression>Predict.
  24. Verify DadAge appears in the Response text box.
  25. Verify MomAge appears below.
  26. Type 40 in first line of MomAge box.
  27. Click OK.
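The correlation and the least-squares line reported in Example Q can be reproduced from the usual sums of squares. The Python sketch below does this for a few invented age pairs; it gives only the point prediction at MomAge = 40, not the confidence or prediction interval around it.

```python
import math

def simple_regression(x, y):
    """Return (intercept, slope, r) for the least-squares line of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Hypothetical ages, for illustration only.
mom_age = [40, 45, 50, 52, 47]
dad_age = [42, 46, 53, 55, 48]

b0, b1, r = simple_regression(mom_age, dad_age)
predicted_dad_age = b0 + b1 * 40   # fitted value when MomAge is 40
print(f"DadAge = {b0:.2f} + {b1:.3f} MomAge, r = {r:.3f}")
print(f"Predicted DadAge at MomAge 40: {predicted_dad_age:.1f}")
```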

ANALYSIS OF VARIANCE (ANOVA)

Example R: Use MINITAB to see if there is a significant difference in mean earnings of freshmen, sophomores, juniors, and seniors in the class. Include side-by-side boxplots to display the data.

  1. First unstack earnings according to year (see Example H).
  2. Choose Statistics>ANOVA>One-Way.
  3. Choose Responses are in a separate column for each factor level.
  4. Specify Earned_1, Earned_2, Earned_3, Earned_4 in the Responses text box.
  5. Click on the Graphs box.
  6. Check the box for Boxplot. (Note that the Confidence interval plot is provided by default.)
  7. Click OK.

You may also compare mean responses of stacked data as it appears in the original worksheet by specifying Earned in the Response box and Year as the Factor variable, using Statistics>ANOVA>One-Way and Responses are in one column for all factor levels. In this case, the "Other" students cannot be omitted.
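The F statistic in one-way ANOVA compares variation between group means with variation within groups. A minimal Python sketch of that calculation follows; the earnings are invented, and the group labels are only meant to echo the unstacked columns Earned_1 to Earned_4.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical earnings (in thousands) for Years 1 to 4.
earned_by_year = [
    [2, 3, 1, 4],   # Earned_1
    [3, 5, 4, 2],   # Earned_2
    [4, 6, 5, 3],   # Earned_3
    [5, 7, 4, 6],   # Earned_4
]
f_stat, df1, df2 = one_way_f(earned_by_year)
print(f"F = {f_stat:.3f} with {df1} and {df2} degrees of freedom")
```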

SINGLE PROPORTIONS

Example S: Use MINITAB to do inference about the population proportion of males/females. [The following only works for categorical variables like Gender that have just 2 possibilities.]

  1. Choose Graph>Pie Chart... and enter Gender as the Categorical variable.
  2. Keep the default Counts of unique values in a categorical variable.
  3. Click OK.
  4. Choose Statistics>1-Sample Inference>Proportion...
  5. Keep default Sample data in a column.
  6. Specify Gender in the Sample box.
  7. Choose female in the drop-down Event box below.
  8. Check Perform hypothesis test.
  9. Type 0.5 in the Hypothesized proportion box.
  10. Click on Options at the top to specify a one-sided alternative or to use another confidence level besides 95% or to opt for Method to be a normal approximation.
  11. Click OK.

Example T: Use MINITAB to do inference about the population proportion preferring a certain color. These steps may be followed if the variable of interest has more than 2 possibilities.

  1. Choose Graph>Pie Chart... and enter FavoriteColor as the Categorical variables.
  2. Click OK.
  3. Choose Statistics>Summary Statistics>Tally...
  4. Specify FavoriteColor in the Variable box.
  5. Check Counts under Statistics.
  6. Click OK.
  7. Note the Count in the FavoriteColor of interest (this counts the Events) and the total count N (this counts the Trials).
  8. Choose Statistics>1-Sample Inference>Proportion.
  9. Choose Summarized data from the drop-down menu at the top.
  10. Specify the Number of events and Number of trials as reported by Minitab in the earlier step.
  11. Check Perform hypothesis test and type 0.125 as the hypothesized proportion.
  12. Click on Options at the top and specify a one-sided alternative if you suspected more or fewer than 1/8 would prefer that color. Under Method check "Normal approximation" if you want your results to be consistent with calculations by hand.
  13. Click OK.
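The "Normal approximation" method mentioned above is the calculation usually done by hand. A short Python sketch follows, using the hypothesized proportion 0.125 from Example T; the counts of events and trials are invented for illustration.

```python
import math
from statistics import NormalDist

def proportion_z_test(events, trials, p0):
    """z statistic and upper-tail P-value using the normal approximation."""
    p_hat = events / trials
    se = math.sqrt(p0 * (1 - p0) / trials)   # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)        # alternative: proportion greater than p0
    return z, p_value

# Hypothetical tally: 20 of 100 students prefer the color of interest.
z, p = proportion_z_test(events=20, trials=100, p0=0.125)
print(f"z = {z:.3f}, P-value = {p:.4f}")
```

For the alternative "less than", the P-value would instead be the area to the left of z.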

TWO-WAY TABLES and CHI-SQUARE

Example U: Use MINITAB to check for a relationship between gender and year at Pitt.

  1. Choose Statistics>Tables>Cross Tabulation and Chi-Square.
  2. Keep the default Raw data (categorical variables).
  3. Decide which should be the explanatory variable; in this case, it would be Gender. Specify Gender for Rows and Year for Columns.
  4. Choose Display from the top. For simple data analysis, check Percent of row total under Percents to display in each cell. The row percents are conditional percentages for respective values of the explanatory variable.
  5. For statistical inference, under Display check the Chi-Square test for association box.
  6. Click OK.
  7. Choose Graph>Bar Chart...
  8. Click on Clustered.
  9. Enter Gender and Year as the Categorical variables (Gender first because it is the explanatory variable, graphed horizontally).
  10. Click OK.

If a two-way table has been created to summarize the data (as in the Cross Tabulation option) you may enter the counts directly into r rows (where r is the number of possibilities for the explanatory variable) and c columns (where c is the number of possibilities for the response variable) in a Minitab worksheet. For instance, for the first (Female) row enter 32 for the 1st (Year) column, 196 for the 2nd column, 71 for the 3rd, 25 for the 4th, and 7 for Other. For the second (Male) row enter 13, 114, 62, 28, 11, respectively, for the five columns 1st through Other. Then choose Statistics>Tables>Cross Tabulation and Chi-Square and select Summarized data in a two-way table from the drop-down menu. Then enter the five column names 1st through Other in the box Columns containing the table. Request Chi-Square test for association under the Display menu and click OK.
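The chi-square calculation for the table of counts given above can be sketched in Python as follows. The counts are those quoted in the text; the closed-form P-value is a shortcut that holds only when there are 4 degrees of freedom, as there are for this 2-by-5 table.

```python
import math

# Counts from the text: rows are Female, Male; columns are Year 1st, 2nd, 3rd, 4th, Other.
observed = [
    [32, 196, 71, 25, 7],
    [13, 114, 62, 28, 11],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count = (row total x column total) / table total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

# Upper-tail chi-square probability; this closed form is valid for df = 4 only.
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

# Row percents: conditional percentages for each value of the explanatory variable.
row_percents = [[100 * o / rt for o in row] for row, rt in zip(observed, row_totals)]

print(f"chi-square = {chi_square:.2f}, df = {df}, P-value = {p_value:.4f}")
```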