Statistics in a Modern World 800
Assignment 2

Homework Exercises Assigned from Part 2 (50 pts.) due Wed., October 9 in Lecture

CHAPTER 7

#1 (4 pts.) At the beginning of this chapter, the following exam scores were listed: 75, 95, 60, 93, 85, 84, 76, 92, 62, 83, 80, 90, 64, 75, 79, 32, 78, 64, 98, 73, 88, 61, 82, 68, 79, 78, 80, 55.

  1. On a separate sheet of paper (staple it to the back), create a stemplot for the test scores.
  2. The shape as shown by the stemplot is (i) symmetric (ii) skewed right (iii) skewed left
  3. The shape is (i) unimodal (ii) bimodal
  4. Are there any outliers? If so, tell what they are.
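One way to check a hand-drawn stemplot is to build the rows programmatically. The sketch below assumes the usual layout for two-digit scores (tens digit as the stem, ones digit as the leaf); the function name `stemplot` is made up for illustration.

```python
# Exam scores from Exercise 1.
scores = [75, 95, 60, 93, 85, 84, 76, 92, 62, 83, 80, 90, 64, 75,
          79, 32, 78, 64, 98, 73, 88, 61, 82, 68, 79, 78, 80, 55]

def stemplot(values):
    """Return stemplot rows: tens digit as stem, sorted ones digits as leaves."""
    leaves = {}
    for v in sorted(values):
        leaves.setdefault(v // 10, []).append(v % 10)
    # Keep empty stems so that gaps in the data stay visible.
    stems = range(min(leaves), max(leaves) + 1)
    return [f"{s} | {' '.join(str(d) for d in leaves.get(s, []))}".rstrip()
            for s in stems]

rows = stemplot(scores)
print("\n".join(rows))
```

Empty stems are printed on purpose: a gap between one value and the rest of the data is what makes a possible outlier visible.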

#2. (2 pts.) Refer to the test scores in Exercise 1.

  1. List the values of the five-number summary, from lowest to highest.
  2. On your separate sheet of paper, create a boxplot.
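A five-number summary can be checked with a short script. This is a sketch under one assumption: the quartiles are taken as the medians of the lower and upper halves of the sorted data. Other quartile conventions exist and can give slightly different values, so follow the textbook's rule if it differs.

```python
from statistics import median

# Exam scores from Exercise 1.
scores = [75, 95, 60, 93, 85, 84, 76, 92, 62, 83, 80, 90, 64, 75,
          79, 32, 78, 64, 98, 73, 88, 61, 82, 68, 79, 78, 80, 55]

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Assumption: quartiles are the medians of the lower and upper halves;
    with an odd count the middle value belongs to neither half.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary(scores)
print(summary)
```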

#3 (1 pt.) On your separate sheet of paper, create a histogram for the test scores.

#8 (2 pts.) Find the mean and standard deviation of these numbers: 10, 20, 25, 30, 40.
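A hand calculation for this exercise can be checked with Python's standard `statistics` module. The sketch computes both versions of the standard deviation, because the exercise does not say whether to divide by n or by n - 1; which one the textbook intends is an assumption you should confirm.

```python
from statistics import mean, pstdev, stdev

numbers = [10, 20, 25, 30, 40]

avg = mean(numbers)
sample_sd = stdev(numbers)        # divides the squared deviations by n - 1
population_sd = pstdev(numbers)   # divides the squared deviations by n

print(avg, round(sample_sd, 2), round(population_sd, 2))
```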

#12 (2 pts.) In each of the following cases, which would probably be higher, the mean or the median, or would they be about equal?

  1. Salaries in a company employing 100 factory workers and 2 highly paid executives (i) mean higher (ii) median higher (iii) about equal
  2. Ages at which residents of a suburban city die, including everything from infant deaths to the most elderly (i) mean higher (ii) median higher (iii) about equal
  3. Heights of all 7-year-old children in a large city (i) mean higher (ii) median higher (iii) about equal
  4. Shoe sizes of adult women (i) mean higher (ii) median higher (iii) about equal

CHAPTER 8

#1 (1.5 pts.) Using Table 8.1 p.137, determine the percentage of the population falling BELOW each of the following standard scores:

  1. -1.00
  2. 1.96
  3. 0.84

#2 (1.5 pts.) Using Table 8.1, determine the percentage of the population falling ABOVE each of the following standard scores:

  1. 1.28
  2. -0.25
  3. 2.33
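The table lookups in Exercises 1 and 2 can be cross-checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names are made up for illustration, and exact values may differ from Table 8.1 in the last digit because the table is rounded.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def percent_below(z):
    """Percentage of a normal population below standard score z."""
    return 100 * standard_normal.cdf(z)

def percent_above(z):
    """Percentage above z: everything that is not below it."""
    return 100 - percent_below(z)

for z in (-1.00, 1.96, 0.84):
    print(f"below {z:5.2f}: {percent_below(z):.1f}%")
for z in (1.28, -0.25, 2.33):
    print(f"above {z:5.2f}: {percent_above(z):.1f}%")
```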

#3 (2 pts.) Using Table 8.1, determine the standard score that has the following percentage of the population BELOW it:

  1. 25%
  2. 75%
  3. 45%
  4. 98%

#4 (2 pts.) Using Table 8.1, determine the standard score that has the following percentage of the population ABOVE it:

  1. 2%
  2. 50%
  3. 75%
  4. 10%
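Exercises 3 and 4 read the table in the other direction, from a percentage back to a standard score. As a rough check, the inverse lookup can be sketched with `NormalDist.inv_cdf`; the function names here are illustrative, and the results will be more precise than the rounded entries in Table 8.1.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_with_percent_below(percent):
    """Standard score that has `percent` of the population below it."""
    return standard_normal.inv_cdf(percent / 100)

def z_with_percent_above(percent):
    """Standard score that has `percent` of the population above it."""
    return z_with_percent_below(100 - percent)

for p in (25, 75, 45, 98):
    print(f"{p}% below: z = {z_with_percent_below(p):.2f}")
for p in (2, 50, 75, 10):
    print(f"{p}% above: z = {z_with_percent_above(p):.2f}")
```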

#5 (1 pt.) Using Table 8.1, determine the percentage of the population falling between the two standard scores given:

  1. -1.28 and 1.75
  2. 0.0 and 1.0

#9 (1 pt.) Stanford-Binet IQs have mean 100 and standard deviation 16. Mensa is an organization that allows people to join only if their IQs are in the top 2% of the population. What is the lowest Stanford-Binet IQ you could have and still be eligible to join Mensa? Show your work.

#14 (1 pt.) A graduate school program in English will admit only students with GRE verbal ability scores in the top 30%. What is the lowest GRE score they will accept? (Recall the mean is 497 and the standard deviation is 115.) Show your work.
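Exercises 9 and 14 follow the same two steps: find the standard score that cuts off the top percentage, then convert it to the original scale with score = mean + z × standard deviation. A minimal sketch, assuming scores are normally distributed (the function name is hypothetical):

```python
from statistics import NormalDist

def cutoff_for_top(percent_top, mean, sd):
    """Lowest raw score still in the top `percent_top` percent.

    Assumes a normal distribution with the given mean and standard deviation.
    """
    z = NormalDist().inv_cdf(1 - percent_top / 100)
    return mean + z * sd

mensa_cutoff = cutoff_for_top(2, mean=100, sd=16)    # Exercise 9
gre_cutoff = cutoff_for_top(30, mean=497, sd=115)    # Exercise 14
print(round(mensa_cutoff, 1), round(gre_cutoff, 1))
```

A z value read from Table 8.1 is rounded to two decimals, so an answer worked by hand may differ slightly from the script's.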

CHAPTER 9

#6 (1 pt.) Using the data shown in Table 9.1 p.154, draw a bar graph presenting the information. Be sure to include all the components of a good statistical picture. [Do this on your separate sheet of paper.]

#10 (1 pt.) Use Table 9.3 to draw a pie chart illustrating the blood-type distribution for white Americans, ignoring the RH factor. [Do this on your separate sheet of paper.]

CHAPTER 10

#2 (1 pt.) In Figure 10.2, we observed that the correlation between husbands’ and wives’ heights, measured in millimeters, was 0.36. What would the correlation be if the heights were converted to inches?
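To see how a change of units affects a correlation, it can help to compute one both ways. The heights below are made up for illustration (they are not the Figure 10.2 data), and `pearson_r` is a plain implementation of the usual correlation formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights in millimeters, for illustration only.
husbands_mm = [1750, 1820, 1690, 1780, 1710, 1850]
wives_mm = [1620, 1650, 1600, 1590, 1640, 1660]

MM_PER_INCH = 25.4
husbands_in = [h / MM_PER_INCH for h in husbands_mm]
wives_in = [w / MM_PER_INCH for w in wives_mm]

r_mm = pearson_r(husbands_mm, wives_mm)
r_in = pearson_r(husbands_in, wives_in)
print(round(r_mm, 4), round(r_in, 4))
```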

#4 (2 pts.) Is each of the following pairs of variables likely to have a positive correlation or a negative correlation?

  1. Daily temperature at noon in New York City and in Boston. (i) positive (ii) negative
  2. Weight of a car and its mean gas mileage (mean miles per gallon) (i) positive (ii) negative
  3. Hours of television watched and GPA for college students (i) positive (ii) negative
  4. Years of education and salary (i) positive (ii) negative

#5 (1 pt.) Suppose a weak relationship exists between two variables in a population. Which would be more likely to result in a statistically significant relationship between the two variables? (i) a sample of size 100 (ii) a sample of size 10,000

#10 (a) (1 pt.) The regression line relating verbal SAT scores and GPA for the data exhibited in Figure 9.5 is GPA = 0.539 + (0.00362)(verbal SAT). Predict the average GPA for those with verbal SAT scores of 500.
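Using a regression line for prediction means substituting the explanatory value into the equation. A small sketch with the coefficients given in the exercise (the function name is illustrative):

```python
def predicted_gpa(verbal_sat):
    """Average GPA predicted by the regression line quoted in the exercise."""
    intercept = 0.539
    slope = 0.00362  # predicted change in GPA per verbal SAT point
    return intercept + slope * verbal_sat

gpa_at_500 = predicted_gpa(500)
print(round(gpa_at_500, 3))
```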

#13. (1 pt.) Use one key word to tell why we should not use the regression equation we found in exercise 12 p. 174 for speed-skating time versus year to predict the winning time for the 2002 Winter Olympics.

CHAPTER 11

#3. (3 pts.) An article in Science News (vol.149, 1 June, 1996, p.345) claimed that "evidence suggests that regular consumption of milk may reduce a person’s risk of stroke, the third leading cause of death in the U.S." The claim was based on an observational study of 3150 men, and the article noted that the researchers "report strong evidence that men who eschew [give up] milk have more than twice the stroke risk of those who drink 1 pint or more daily." The article concluded by noting that "those who consumed the most milk tended to be the leanest and the most physically active."

  1. Circle the word or words that tell the explanatory variable.
  2. Underline the word or words that tell the response variable.
  3. Pick one of the seven reasons listed on page 186 which provides a good explanation for the relationship. (Circle it.) 1 2 3 4 5 6 7

#4 (2 pts.) Iman (1994, p. 505) presents data on how college students and experts perceive risks for 30 activities or technologies. Each group ranked the 30 activities. The rankings for the eight greatest risks, as perceived by the experts, are shown in Table 11.5.

  1. Prepare a scatterplot of the data, with students’ ranks on the vertical axis and experts’ ranks on the horizontal axis. [Use your separate sheet of paper.]
  2. Another technology listed was nuclear power, ranked first by the students and 20th by the experts. If nuclear power were added to the list, do you think the correlation between the two sets of rankings would increase or decrease?

#6 (1 pt.) Which one of the seven reasons for relationships listed in Section 11.3 p.186 is supposed to be ruled out by designed experiments?

#12 (6 pts.) Suppose a positive relationship had been found between each of the following sets of variables. In Section 11.3 p.186, seven potential reasons for such relationships are given. Circle the one of the seven reasons which is most likely to account for the relationship in each case.

  1. Number of deaths from automobiles and beer sales for each year from 1950 to 1990. 1 2 3 4 5 6 7
  2. Number of ski accidents and average wait time for the ski lift for each day during one winter at a ski resort. 1 2 3 4 5 6 7
  3. Stomach cancer and consumption of barbecued foods, which are known to contain carcinogenic (cancer-causing) substances 1 2 3 4 5 6 7
  4. Self-reported level of stress and blood pressure 1 2 3 4 5 6 7
  5. Amount of dietary fat consumed and heart disease 1 2 3 4 5 6 7
  6. Twice as many cases of leukemia in a new high school, built near a power plant, as at the old high school 1 2 3 4 5 6 7

CHAPTER 12

#3 (3 pts.) According to the University of California at Berkeley Wellness Letter (Feb 1994, p.1), only 40% of all surgical operations require an overnight stay at a hospital.

  1. The proportion of surgical operations requiring an overnight stay is…
  2. The risk of requiring an overnight stay is…
  3. The odds of requiring an overnight stay are…
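The three quantities in this exercise come from one number. The sketch below assumes the usual textbook definitions: proportion and risk are both (number with the trait) / (total), and odds compare those with the trait to those without, p / (1 - p). Exact fractions are used so the odds print in "a to b" form.

```python
from fractions import Fraction

def odds_from_proportion(p):
    """Odds of the event: proportion with the trait over proportion without."""
    return p / (1 - p)

p = Fraction(40, 100)            # 40% of operations require an overnight stay
odds = odds_from_proportion(p)

print(f"proportion = {float(p)}")
print(f"odds = {odds.numerator} to {odds.denominator}")
```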

#13 (3 pts.) Reporting on a study of drinking and drug use among college students in the U.S., a Newsweek reporter wrote: "Why should college students be so impervious to the lesson of the morning after? Efforts to discourage them from using drugs actually did work. The proportion of college students who smoked marijuana at least once in 30 days went from one in three in 1980 to one in seven last year [1993]; cocaine users dropped from 7% to 0.7% over the same period." (19 December 1994, p.72)

  1. What was the relative risk of cocaine use for college students in 1980 compared with college students in 1993?
  2. Are the figures for marijuana use (for example, "one in three") presented as proportions or odds?
  3. Is the statement that "efforts to discourage them from using drugs actually did work" justified? Answer yes or no and explain briefly.
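Relative risk is the ratio of two risks, with the comparison group in the denominator. A minimal sketch using the figures quoted in the exercise (exact fractions avoid floating-point noise; the function name is illustrative):

```python
from fractions import Fraction

def relative_risk(risk_group, risk_baseline):
    """Risk in one group divided by the risk in the comparison group."""
    return risk_group / risk_baseline

cocaine_1980 = Fraction(7, 100)    # 7%
cocaine_1993 = Fraction(7, 1000)   # 0.7%
rr_cocaine = relative_risk(cocaine_1980, cocaine_1993)

marijuana_1980 = Fraction(1, 3)    # "one in three"
marijuana_1993 = Fraction(1, 7)    # "one in seven"
rr_marijuana = relative_risk(marijuana_1980, marijuana_1993)

print(rr_cocaine, float(rr_marijuana))
```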

#16 (2 pts.) Compute the chi-squared statistic for the relationship between bird ownership and lung cancer, based on the data in Exercise 15 p.222.

Is there statistically significant evidence of a relationship?
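The chi-squared statistic for a 2×2 table sums (observed - expected)² / expected over the four cells, where each expected count is (row total × column total) / grand total. The counts below are hypothetical, since the Exercise 15 data are not reproduced here. The cutoff 3.84 is the standard 5% critical value for one degree of freedom.

```python
def chi_squared_2x2(table):
    """Chi-squared statistic for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts for illustration only (not the Exercise 15 data).
observed_counts = [[10, 20],
                   [30, 40]]

CRITICAL_VALUE = 3.84  # 5% level, 1 degree of freedom
stat = chi_squared_2x2(observed_counts)
print(round(stat, 3), "significant" if stat > CRITICAL_VALUE else "not significant")
```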

CHAPTER 13

#1 (1 pt.) The price of a first-class stamp in 1970 was 8 cents, whereas in 1997 it was 32 cents. The Consumer Price Index for 1970 was 38.8, whereas for 1997 it was 160.5. If the true cost of a first-class stamp did not increase between 1970 and 1997, what should it have cost in 1997? In other words, what would an 8-cent stamp in 1970 cost in 1997, when adjusted for inflation?
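Adjusting a price for inflation scales it by the ratio of the two CPI values: value at the later date = value at the earlier date × (later CPI / earlier CPI). A short sketch with the numbers from the exercise (the function name is illustrative):

```python
def adjust_for_inflation(price, cpi_then, cpi_now):
    """Scale a price by the ratio of the later CPI to the earlier CPI."""
    return price * cpi_now / cpi_then

stamp_1997_cents = adjust_for_inflation(8, cpi_then=38.8, cpi_now=160.5)
print(round(stamp_1997_cents, 1))
```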

PRACTICE PROBLEMS FOR CHAPTER 14

These should be done for practice but not handed in.

#1 For each of the following time series, do you think the long-term trend would be positive, negative, or nonexistent?

  1. The cost of a loaf of bread measured monthly from 1960 to 1998.
  2. The temperature in Boston measured at noon on the first day of each month from 1960 to 1998.
  3. The price of a basic computer, adjusted for inflation, measured monthly from 1970 to 1998.
  4. The number of personal computers sold in the U.S. measured monthly from 1970 to 1998.

#2. Which of the four time series in Exercise 1 would have the strongest seasonal component?

#18 According to the World Almanac and Book of Facts (1995, p.380), the population of Austin, Texas (reported in thousands), has grown as follows:

Year        1950   1960   1970   1980   1990
Population  132.5  186.5  253.5  345.5  465.6

  1. Of the three nonrandom components of time series (trends, seasonal, and cycles), which do you think would be most likely to explain the data if you were to see the population of Austin, Texas, by month, from 1950 to 1990?
  2. The regression equation relating the last two digits of each year (50, 60, and so on) to the population for Austin, Texas, is: population = -301 + 8.25(year). Use this equation to predict the population of Austin for the year 2000.
  3. Consider the method you used for the prediction in part 2. Do you think it is likely to be accurate? Would the same method continue to give accurate predictions for the years 2010, 2020, and so on?
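For Exercise 18, comparing the regression line with the observed populations shows how well a straight line tracks the data before it is extended. The sketch assumes the year 2000 is coded as 100, continuing the 50, 60, ..., 90 pattern. That coding is an interpretation, not something the exercise states.

```python
def predicted_population(year_code):
    """Population in thousands from the regression line in the exercise."""
    return -301 + 8.25 * year_code

# Observed populations (thousands), keyed by the last two digits of the year.
observed = {50: 132.5, 60: 186.5, 70: 253.5, 80: 345.5, 90: 465.6}

for code, actual in observed.items():
    fitted = predicted_population(code)
    print(f"19{code}: observed {actual}, fitted {fitted}, "
          f"residual {actual - fitted:+.1f}")

# Assumption: 2000 is coded as 100 so the year scale stays continuous.
prediction_2000 = predicted_population(100)
print("2000:", prediction_2000)
```

The sign pattern of the residuals is worth looking at when judging whether the line will keep giving accurate predictions.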

