
Basic Applied Statistics 1000
Solutions to Practice Midterm 1

 (ii) Crunchy (stemplot is centered around higher values)
 (ii)Crunchy's somewhat more variable (stemplot has more spread)
 (ii) Crunchy (stemplot has longer right tail, whereas Creamy has a
slightly longer left tail, when configured on an axis with lower values
to the left, higher values to the right)
 1st, 5th, average of 9th and 10th, 14th, 18th: 34,42,51,62,80
 IQR = 62 - 42 = 20; Q1 - 1.5(IQR) = 42 - 1.5(20) = 12
 yes
 (i) Creamy (min is 22, max is 68, etc.)
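
The position and IQR arithmetic above can be checked with a short sketch. It assumes the "median of each half" convention for quartiles, which matches the positions quoted (5th and 14th of 18); the helper name `median_position` is made up for illustration, and the upper fence is computed only for completeness.

```python
def median_position(lo, hi):
    """Return the 1-based position(s) whose value (or average) is the
    median of the ordered observations ranked lo..hi."""
    n = hi - lo + 1
    mid = lo + (n - 1) // 2
    return (mid,) if n % 2 == 1 else (mid, mid + 1)

# 18 ordered observations: median from ranks 1..18, quartiles from each half.
median_pos = median_position(1, 18)   # 9th and 10th
q1_pos = median_position(1, 9)        # 5th
q3_pos = median_position(10, 18)      # 14th

# Five-number summary quoted in the solution.
minimum, q1, median, q3, maximum = 34, 42, 51, 62, 80

iqr = q3 - q1                     # 62 - 42
lower_fence = q1 - 1.5 * iqr      # 42 - 1.5(20)
upper_fence = q3 + 1.5 * iqr      # not in the solution; shown for completeness

print(median_pos, q1_pos, q3_pos)
print(iqr, lower_fence, upper_fence)
```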

 (ii) histogram (1 quantitative variable)
 (ii) mean and standard deviation, because the distribution should
be fairly normal

 (iv) scatterplot (2 quantitative variables)
 (iv) report the correlation

 (iii) side-by-side boxplots (1 quantitative variable plus 1 categorical
variable)
 (iii) Compare Five Number Summaries (there could be right skewness/high
outliers)

 .9505 (look up proportion LESS than +1.65)
 approximately 1
 .0160 (subtract proportion for -1.4 from proportion for -1.3; proportions
CANNOT be negative, so I took off 3 pts. for an answer of -.016)
 Look up a proportion of .8000 below, and find z=.84
 Look up a proportion of .1000 below, and find z=-1.28
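
The table lookups above can be reproduced without a printed table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of the course's normal table, so the values agree with the table only after rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion below z = +1.65
below_1_65 = Z.cdf(1.65)

# Proportion between z = -1.4 and z = -1.3: larger area minus smaller area,
# so the result cannot come out negative.
between = Z.cdf(-1.3) - Z.cdf(-1.4)

# z-value with proportion .8000 below it
z_80 = Z.inv_cdf(0.80)

# z-value with proportion .1000 below it (negative: it sits in the left tail)
z_10 = Z.inv_cdf(0.10)

print(round(below_1_65, 4), round(between, 4), round(z_80, 2), round(z_10, 2))
```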

 (iii) 3 is a parameter mu because it describes the entire population
 Half are below the mean, 3.0
 Take mean plus or minus 3 standard deviations: between 2.7 and 3.3
 x>3.14 means z>1.4; by symmetry, look up proportion less than -1.4: .0808
 .09 below has z=-1.34, so x = 3 - 1.34(.1) = 2.866
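
The same answers follow from working directly with a normal distribution with mean 3.0 and standard deviation .1, the values used in the solution. This sketch again substitutes `statistics.NormalDist` for the table; the last result differs from 2.866 only in the fourth decimal place because the table rounds z to -1.34.

```python
from statistics import NormalDist

X = NormalDist(mu=3.0, sigma=0.1)  # mean and standard deviation from the solution

half_below = X.cdf(3.0)              # half of a normal distribution lies below its mean

# Mean plus or minus 3 standard deviations
low = X.mean - 3 * X.stdev
high = X.mean + 3 * X.stdev

# Proportion with x > 3.14, that is, z > (3.14 - 3) / .1 = 1.4
prop_above = 1 - X.cdf(3.14)

# Value with proportion .09 below it: x = mean + z * sd, where z is negative
x_09 = X.inv_cdf(0.09)

print(half_below, round(low, 1), round(high, 1))
print(round(prop_above, 4), round(x_09, 3))
```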

 number of persons
 (i) increase
 (ii) moderate
 (v) .63 (this could also be found by taking the square root of RSq=.396)
 (ii) stay the same; r is independent of units of measurement
 4.55+0.996(1)=5.546
 6.83 - 5.546 = 1.284
 (ii) Take the number of people and add 4.55
 2 (Number of persons=2 clearly has the most extreme residual, and it's
even singled out as an outlier in the output at the bottom of the page.
However, I only took off 2 points if you didn't notice the residual for 2,
and thought the one for 6 persons was the most extreme.)
 (ii) 10 people discarding 4 pounds (This is the only one of the three
that is way off the regression line, and the fact that its x-value is far
from the rest could give it a lot of influence.)
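
The regression arithmetic in this problem can be sketched as follows. The intercept 4.55, slope 0.996, observed value 6.83, and RSq = .396 are taken from the solutions; the function and variable names are made up for illustration.

```python
import math

INTERCEPT = 4.55   # from the fitted line in the solution
SLOPE = 0.996      # predicted change in the response per additional person

def predict(persons):
    """Predicted response for a household with the given number of persons."""
    return INTERCEPT + SLOPE * persons

predicted = predict(1)          # 4.55 + 0.996(1)
residual = 6.83 - predicted     # observed minus predicted

# r is the square root of RSq, with the sign of the slope (positive here).
r = math.copysign(math.sqrt(0.396), SLOPE)

print(round(predicted, 3), round(residual, 3), round(r, 2))
```

Because the slope is so close to 1, `predict(persons)` is nearly the same as adding 4.55 to the number of persons.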

 (ii) (In this design, a treatment, encouraging mothers to
breastfeed, is imposed.)
 (i) 200 is the sample size, the number of individuals actually studied
 (ii) infants not participating are the control
 (ii) income/education is a possible lurking variable, which is tied
in with whether or not a baby is breastfed, and could also impact the
likelihood of infection. (i) is in fact the explanatory variable, (iii) is
the response.
 (ii) random assignment to treatment or control is essential; I
mentioned coin-flipping in class as a viable way to make assignments
 (ii) The fact that mothers must know they are being encouraged to
breastfeed rules out the possibility of a double-blind study. It would
certainly be possible to make it blind on the part of doctors: they wouldn't
have to know if a baby has been breastfed when they are diagnosing for
infection.

 4+4(.25)=5
 4(.05)=.2
