Question 9 of our Applied Regression final exam (and solution to question 8)

Here’s question 9 of our exam:

9. We downloaded data with weight (in pounds) and age (in years) from a random sample of American adults. We created a new variable, age10 = age/10. We then fit a regression:

lm(formula = weight ~ age10)
(Intercept)    161.0     7.3
age10            2.6     1.6
  n = 2009, k = 2
  residual sd = 119.7, R-Squared = 0.00

Make a graph of weight versus age (that is, weight in pounds on y-axis, age in years on x-axis). Label the axes appropriately, draw the fitted regression line, and make a scatterplot of a bunch of points consistent with the information given and with ages ranging roughly uniformly between 18 and 90.

And the solution to question 8:

8. Out of a random sample of 50 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office.

This is a job for the Agresti-Coull interval: y* = y + 2 = 2, n* = n + 4 = 54, p* = y*/n* = 2/54 = 0.037, with standard error sqrt(p*(1-p*)/n*) = sqrt((2/54)*(52/54)/54) = 0.026. The interval is [p* +/- 2se] = [-0.014, 0.088], but a probability can’t be negative, so we report [0, 0.088] or simply [0, 0.09].
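
For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation above. The function name agresti_coull_interval and the multiplier argument z are my own choices for illustration, not anything from the exam. It uses the simple "add 2 successes and 4 trials" version with a multiplier of 2, as in the solution, and clips the interval to [0, 1].

```python
import math


def agresti_coull_interval(y, n, z=2.0):
    """Approximate 95% interval for a proportion, given y successes in n trials.

    Hypothetical helper: adds 2 successes and 2 failures, then applies
    the usual estimate +/- z standard errors, clipped to [0, 1].
    """
    n_star = n + 4                      # n* = n + 4
    p_star = (y + 2) / n_star           # p* = (y + 2) / n*
    se = math.sqrt(p_star * (1 - p_star) / n_star)
    lower = max(0.0, p_star - z * se)   # a proportion can't be negative
    upper = min(1.0, p_star + z * se)   # ... or exceed 1
    return p_star, se, lower, upper


# Data from question 8: 0 officeholders out of 50 respondents.
p_star, se, lower, upper = agresti_coull_interval(0, 50)
print(f"p* = {p_star:.3f}, se = {se:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
# prints: p* = 0.037, se = 0.026, interval = [0.000, 0.088]
```

Clipping at zero inside the function guards against the second common mistake below (reporting negative values), and because the lower bound lands at zero for 0/50 data, the interval also includes zero as it should.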

Common mistakes

Most of the students remembered the Agresti-Coull interval, but some made the mistake of giving confidence intervals that excluded zero (which can’t be right, given that the data are 0/50) or that included negative values.