
STATISTICS 462 – Summer 2016 Homework 7

DUE Tuesday, August 9th

Unless otherwise stated, you can use R for any of the calculations, but make sure you include your code. Your code should not be a copy of anyone else’s! Any code you turn in should be well organized and commented so the grader can understand your answers.

All programming questions should be submitted to the dropbox on ANGEL for this assignment as a .pdf file using the naming convention HWNum_FirstInitialLastName.pdf. For example, John Doe would submit a file titled HW1_JDoe.pdf for the first assignment. Your answers to programming questions should include both code and a description of your results. I recommend using R Markdown for writing up your answers. A template for writing up an assignment in R Markdown can be found on ANGEL. R Markdown files can be compiled directly within RStudio. Alternatively, answers may be saved in a Word document or LaTeX and converted into a .pdf file.

Non-coding questions can either be written and submitted in the same file as your coding questions using LaTeX typesetting (see https://latex-project.org/intro.html) or they may be handwritten and turned in separately during class.

1. Load the “wine.Rdata” dataset. This dataset contains chemical and physical attributes for 1,599 red wines, as well as a quality assessment (quality = 1: good, quality = 0: poor).

(a) Fit a logistic regression model with wine quality taken as the response, and the remaining variables as covariates.

(b) Clearly state the statistical model definition for this logistic regression model. Include any relevant assumptions.

(c) Interpret the model coefficients. What does this output indicate about the marginal association between each covariate and the mean response?

(d) Predict the probability that a wine is of good quality (quality = 1) under the following covariate settings.

Variable               Level
fixed acidity          10.7
volatile acidity       0.74
citric acid            0.52
residual sugar         3.6
chlorides              0.11
free sulfur dioxide    31
total sulfur dioxide   93.2
density                0.999
pH                     3.5
sulphates              0.85
alcohol                12

(e) Use the predict() function to get predicted probabilities from the fitted model. Using the following decision rule, transform the predicted probabilities into predicted response values. Create a confusion matrix for these predictions (i.e., true positives, false positives, true negatives, and false negatives).
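
As a rough illustration of the workflow in parts (a), (d), and (e), the sketch below fits a logistic regression with glm(), predicts at a new covariate setting, and builds a confusion matrix. It runs on simulated data, not on wine.Rdata: the data frame sim, its column names, the generating coefficients, and the 0.5 cutoff are all assumptions made for illustration. In particular, the cutoff is only a placeholder and does not stand in for the decision rule given in the assignment.

```r
# Simulated stand-in for the wine data. The object stored in wine.Rdata and
# its column names are not known here, so everything below is hypothetical.
set.seed(462)
n <- 200
sim <- data.frame(
  alcohol          = rnorm(n, mean = 10.4, sd = 1),
  volatile_acidity = rnorm(n, mean = 0.53, sd = 0.18)
)

# Log-odds used only to generate a fake 0/1 response (arbitrary coefficients)
eta <- -10 + 1.0 * sim$alcohol - 2.0 * sim$volatile_acidity
sim$quality <- rbinom(n, size = 1, prob = plogis(eta))

# (a) Logistic regression of quality on all remaining columns
fit <- glm(quality ~ ., data = sim, family = binomial)
summary(fit)

# (d) Predicted probability of quality = 1 at one new covariate setting
new_wine <- data.frame(alcohol = 12, volatile_acidity = 0.74)
p_new <- predict(fit, newdata = new_wine, type = "response")

# (e) Fitted probabilities for the observed data, then a 0/1 classification
p_hat  <- predict(fit, type = "response")
cutoff <- 0.5  # ASSUMED placeholder; replace with the assignment's decision rule
y_hat  <- as.integer(p_hat > cutoff)

# Confusion matrix: rows are predicted classes, columns are observed classes
conf_mat <- table(
  predicted = factor(y_hat, levels = c(0, 1)),
  observed  = factor(sim$quality, levels = c(0, 1))
)
print(conf_mat)
```

In this layout, the cell with predicted = 1 and observed = 1 counts true positives, and the cell with predicted = 1 and observed = 0 counts false positives. Wrapping both variables in factor() with fixed levels keeps the table 2 x 2 even if one class is never predicted.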

2. The board of directors of a professional association conducted a random sample survey of 30 members to assess the effects of several possible amounts of changes in membership dues. The predictor X denotes, in dollars, the change in annual dues from the previous year posited in the survey interview, and the response is binary: Y = 1 if the interviewee indicated that the membership will NOT be renewed at that amount of change in dues and Y = 0 if the membership will be renewed. The output for fitting the logistic regression model is given below. Use this to answer the following questions.

(a) Write the estimated equation as a function of X for

i. The log-odds of not renewing a membership
ii. The odds of not renewing a membership
iii. The probability of not renewing a membership
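
For reference, the three quantities in part (a) are related as follows for a simple logistic regression. The symbols are notation introduced here, not taken from the assignment: \hat{\pi}(X) is the estimated probability that Y = 1 (non-renewal) at dues change X, and \hat{\beta}_0 and \hat{\beta}_1 are the estimated intercept and slope, to be read from the model output.

```latex
% i. Estimated log-odds of not renewing
\log\!\left(\frac{\hat{\pi}(X)}{1 - \hat{\pi}(X)}\right) = \hat{\beta}_0 + \hat{\beta}_1 X

% ii. Estimated odds of not renewing
\frac{\hat{\pi}(X)}{1 - \hat{\pi}(X)} = \exp\!\left(\hat{\beta}_0 + \hat{\beta}_1 X\right)

% iii. Estimated probability of not renewing
\hat{\pi}(X) = \frac{\exp\!\left(\hat{\beta}_0 + \hat{\beta}_1 X\right)}{1 + \exp\!\left(\hat{\beta}_0 + \hat{\beta}_1 X\right)}
```

Each line follows from the one before it: exponentiating the log-odds gives the odds, and solving odds = \hat{\pi} / (1 - \hat{\pi}) for \hat{\pi} gives the probability.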