Consider the crime data stored in crime.csv. We would like to understand how the murder rate is
related to the other variables in the dataset. Note that State is the "subject" here; it is not a
predictor, and region is a qualitative variable. You can remove the State column from the dataset
with the following command: crime.data <- crime[, -1].
(a) Build a linear regression model murder rate vs single.parent and comment on your model.
(b) Build a multiple linear regression model to predict murder rate based on the other variables.
The model should have all the important variables and it should not have any unimportant
variables. Be sure to explore the interactions as well. Perform model diagnostics to check the
standard model assumptions and perform any transformations needed to obtain a model for
which the assumptions reasonably hold.
(c) Use your final model to predict murder rate.
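The diagnostics and transformation step in part (b) can be sketched as follows. This is only an illustration under stated assumptions: the data frame toy and its columns x and y are simulated stand-ins (not crime.csv), and the log transformation is just one common choice for a right-skewed response; the crime data may call for something different.

```r
# Simulated stand-in data (hypothetical); replace with crime.data in the assignment.
set.seed(3)
toy <- data.frame(x = runif(80, 1, 10))
toy$y <- exp(0.5 + 0.3 * toy$x + rnorm(80, sd = 0.3))  # right-skewed by construction

# Fit on the raw scale and on the log scale
fit.raw <- lm(y ~ x, data = toy)
fit.log <- lm(log(y) ~ x, data = toy)

# Residuals vs. fitted (constant variance, linearity) and normal Q-Q (normality)
par(mfrow = c(2, 2))
plot(fit.raw, which = 1:2)
plot(fit.log, which = 1:2)

# Shapiro-Wilk test on the residuals as a numerical normality check
shapiro.raw <- shapiro.test(residuals(fit.raw))
shapiro.log <- shapiro.test(residuals(fit.log))
shapiro.raw$p.value
shapiro.log$p.value
```

If the residual plots for the raw model show a funnel shape or a curved Q-Q line while the transformed model does not, that is evidence in favor of keeping the transformation.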
1. Introduction
Discuss the statement of the problem in terms of the statistical analyses that are being performed. Be
sure to address the following:
What is the data set that you are exploring?
How will your results be used?
What type of analyses will you be running in this project?
Answer the questions in a paragraph response. Remove all questions and this note before
submitting! Do not include R code in your report.
2. Data Preparation
1. What are the variables stored in crime.csv? (Use the following functions.)
crime <- read.csv("crime.csv")
str(crime)
2. Remove the State variable (qualitative)
# remove the State variable (first column)
crime.data <- crime[, -1]
pairs(crime.data)
3. Explain the correlation coefficients between variables (use cor(crime.data[, -8]))
Answer the questions in a paragraph response. Remove all questions and this note before
submitting! Do not include R code in your report.
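The preparation steps above can be tried on a small made-up data frame before running them on the real file. This is a sketch under assumptions: the values below are invented for illustration, murder.rate is an assumed column name, and selecting numeric columns with is.numeric is an alternative to dropping the qualitative column by position.

```r
# Invented illustration data (hypothetical values and state labels)
toy <- data.frame(
  State         = c("AA", "BB", "CC", "DD", "EE"),
  murder.rate   = c(3.1, 7.4, 5.0, 9.2, 4.3),     # assumed column name
  single.parent = c(20.5, 30.1, 25.0, 33.8, 22.7),
  region        = c("N", "S", "W", "S", "N")
)

# Drop the State column (the "subject", not a predictor)
toy.data <- toy[, -1]

# Treat region as a qualitative variable
toy.data$region <- factor(toy.data$region)

# Correlations are only defined for numeric columns, so exclude the factor
num.cols <- sapply(toy.data, is.numeric)
cor.mat <- cor(toy.data[, num.cols])
round(cor.mat, 2)
```

Selecting columns by type avoids relying on the position of the qualitative column.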
3. Simple Linear Regression: Create a simple linear regression model to predict the murder rate using
single.parent.
In general, how is a simple linear regression model used to predict the response variable using
the predictor variable?
What is the equation for your model?
Report the P-value in a formatted table as shown below:
Statistic Value
P-value X.XXXX
*Round off to 4 decimal places.
What is the predicted murder rate for a state that has a single.parent value of 29? Round your
answer down to the nearest integer.
What is the predicted murder rate for a state that has a single.parent value of 25.4? Round your
answer down to the nearest integer.
Answer the questions in a paragraph response. Remove all questions and this note (but not the
table) before submitting! Do not include R code in your report.
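A minimal sketch of the simple linear regression workflow, assuming a response column named murder.rate (the actual name in crime.csv may differ). The data frame toy is simulated so that the code runs on its own; in the assignment, fit the model on crime.data instead, and expect different numbers.

```r
# Simulated stand-in data (hypothetical); use crime.data in the assignment.
set.seed(1)
toy <- data.frame(single.parent = runif(50, 15, 40))
toy$murder.rate <- -8 + 0.6 * toy$single.parent + rnorm(50, sd = 2)

# Fit murder rate on single.parent
fit.slr <- lm(murder.rate ~ single.parent, data = toy)

# Intercept and slope give the model equation:
# murder.rate = b0 + b1 * single.parent
coef(fit.slr)

# P-value for the slope, rounded to 4 decimal places
coefs <- summary(fit.slr)$coefficients
p.value <- coefs["single.parent", "Pr(>|t|)"]
round(p.value, 4)

# Predictions for the two requested values, rounded down
new.states <- data.frame(single.parent = c(29, 25.4))
pred <- predict(fit.slr, newdata = new.states)
floor(pred)
```

Note that floor() rounds down, as the questions request, whereas round() would round to the nearest integer.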
4. Multiple Regression: Create a multiple linear regression model to predict the murder rate
using all important variables.
In general, how is a multiple linear regression model used to predict the response variable using
predictor variables?
Report the P-value in a formatted table as shown below:
Statistic Value
P-value X.XXXX
*Round off to 4 decimal places.
Based on the overall p-value, is at least one of the predictors statistically
significant in predicting the murder rate?
List all the important variables to predict the murder rate based on p-value.
Report and interpret the coefficient of determination.
What is the equation for your model?
What is the predicted murder rate based on your final multiple regression model (assume some
random values for your variables)?
Answer the questions in a paragraph response. Remove all questions and this note (but not the
table) before submitting! Do not include R code in your report.
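The quantities requested in this section can be extracted from a fitted model as sketched below. Everything about the data is an assumption: toy is simulated, poverty is a hypothetical predictor, murder.rate is an assumed column name, and the formula (one interaction plus region) is only an example of the syntax, not a recommended final model.

```r
# Simulated stand-in data (hypothetical); use crime.data in the assignment.
set.seed(2)
n <- 60
toy <- data.frame(
  single.parent = runif(n, 15, 40),
  poverty       = runif(n, 8, 25),   # hypothetical predictor
  region        = factor(sample(c("N", "S", "W"), n, replace = TRUE))
)
toy$murder.rate <- -6 + 0.4 * toy$single.parent + 0.2 * toy$poverty +
  rnorm(n, sd = 2)

# single.parent * poverty expands to both main effects plus their interaction
fit.mlr <- lm(murder.rate ~ single.parent * poverty + region, data = toy)
s <- summary(fit.mlr)

# Per-coefficient p-values, used to judge which variables are important
s$coefficients[, "Pr(>|t|)"]

# Overall p-value from the F statistic (not stored directly in the summary)
f <- s$fstatistic
overall.p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
round(overall.p, 4)

# Coefficient of determination and its adjusted version
s$r.squared
s$adj.r.squared

# Prediction for one hypothetical state
new.state <- data.frame(
  single.parent = 29,
  poverty       = 15,
  region        = factor("S", levels = levels(toy$region))
)
pred.mlr <- predict(fit.mlr, newdata = new.state)
pred.mlr
```

The new data frame must contain every predictor in the model, with factor levels matching those used in the fit.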
5. Conclusion
Describe the results of the statistical analyses clearly, using proper descriptions of statistical terms and
concepts. Fully describe what these results mean for your scenario.
Briefly summarize your findings in plain language.
What is the practical importance of the analyses that were performed?