

Mohit Arora
Happy Learning !!!
How to Predict Using the Model?
Grades.csv consists of 105 observations and 22 variables (mostly categorical or nominal). The data describes the students: gender, ethnicity, class section, marks in 5 quizzes, marks in the final, total marks, GPA and percentage. It also records whether each student passed or failed.
Since the main motive of the project was to build a predictive model for final that can be put to use, I have built two linear regression models to predict final.
Model to predict final using quiz2 & quiz3
The first linear regression model predicts the final score from the performance in quiz2 and quiz3 combined. The name of this model is f23. This multiple regression model explains 36.71% of the variance in final (its R-squared); the remaining 63.29% is due to factors outside the model, such as quiz1, quiz4, quiz5, GPA or total. Both predictors have a significant slope with the response variable final, as checked by the p-value of the t-test for each predictor. Moreover, the Variance Inflation Factor (VIF) for both variables is 1.89, which is in the acceptance zone, so multicollinearity between the two predictors is not a concern. Also, the Durbin-Watson statistic (DWS) for the model is 2.233, which lies in the no-autocorrelation zone, supporting the use of linear regression here. The standard error of estimate of the model is 6.381. This is the typical size of the gap between an actual and a predicted final score, and it is what we use to put a range around a prediction at a chosen confidence level such as 95%. The equation of the model for predicting final using quiz2 and quiz3, which explains 36.71% of the variance in final, is given below:
final = 39.7129 + (1.5407 × quiz2) + (1.1862 × quiz3)
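As a minimal sketch, the f23 equation can be wrapped in a small Python function. The function name predict_final_f23 and the example quiz scores are assumptions made for illustration; only the coefficients come from the fitted model.

```python
def predict_final_f23(quiz2: float, quiz3: float) -> float:
    """Point estimate of final from the f23 equation."""
    intercept = 39.7129  # coefficients taken from the fitted model f23
    b_quiz2 = 1.5407
    b_quiz3 = 1.1862
    return intercept + b_quiz2 * quiz2 + b_quiz3 * quiz3


# Hypothetical student (made-up values, not a row from Grades.csv)
print(round(predict_final_f23(quiz2=8, quiz3=7), 2))
```

With both quiz scores at 0 the function returns the intercept, 39.7129, which is a quick sanity check that the coefficients were typed in correctly.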
The p-value of the F-test is less than the level of significance (LOS), meaning that the overall model is significant in explaining the dependent variable final. The model assumptions (normality, linear relationship and constant error variance) are almost satisfied by the data, so we can say that this model is a reasonable way to predict final using quiz2 and quiz3 scores.
Model for predicting final using the variables total and quiz3
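The Durbin-Watson statistic quoted for the model can be computed from the model's residuals taken in observation order. The sketch below shows the calculation; the residual values are hypothetical and the function name durbin_watson is our own choice. The statistic ranges from 0 to 4, and values near 2 indicate little first-order autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only (not from f23 or ft3)
example_residuals = [0.5, -0.2, 0.1, -0.4, 0.3]
print(round(durbin_watson(example_residuals), 3))
```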
The name of the model is ft3. This multiple regression model explains 87.31% of the variance in final (its R-squared); the remaining 12.69% is due to factors outside the model, such as quiz1, quiz2, quiz4, quiz5 or GPA, alone or grouped together. Both predictors have a significant slope with the response variable final, as checked by the p-values of the t-tests: 2e-16 for total and 6.192e-14 for quiz3. The equation for the model ft3 is given by:
final = 6.67358 + (0.69502 × total) - (1.89612 × quiz3)
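To make the signs in the ft3 equation concrete, here is a small sketch. The function name predict_final_ft3 and the input values are assumptions for illustration. It shows that, with total held fixed, raising quiz3 by one point lowers the predicted final by the quiz3 coefficient.

```python
def predict_final_ft3(total: float, quiz3: float) -> float:
    """Point estimate of final from the ft3 equation."""
    return 6.67358 + 0.69502 * total - 1.89612 * quiz3


# Hypothetical inputs: same total, quiz3 one point apart
base = predict_final_ft3(total=100, quiz3=6)
bumped = predict_final_ft3(total=100, quiz3=7)
print(round(base, 2), round(bumped, 2), round(bumped - base, 5))
```

The difference between the two predictions equals the quiz3 coefficient with its sign, which is how the negative term in the equation should be read.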
The Variance Inflation Factor (VIF) for both variables is 3.21, which is in the acceptance zone (a common rule of thumb treats values below 5 as acceptable), so multicollinearity between the two predictors is not a serious concern. Also, the Durbin-Watson statistic (DWS) for the model is 2.113, which lies in the no-autocorrelation zone, supporting the use of linear regression here. The standard error of estimate of the model is 2.857. This is the typical size of the gap between an actual and a predicted final score, and it is what we use to put a range around a prediction at a chosen LOS. The p-value of the F-test is less than the LOS, meaning that the overall model is significant in explaining the dependent variable final. The model assumptions (normality, linear relationship and constant error variance) are very well satisfied by the data, so we can say that this model is a sound way to predict final using total and quiz3 scores.
This model can be used by the principal if she does not have a final score but already has a total score. It predicts the final score from the combination of total and quiz3 and explains 87.31% of the variance in final.
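In a regression with exactly two predictors, both predictors share the same VIF, equal to 1 / (1 - r^2), where r is the correlation between the two predictors. That is why a single VIF is reported for both variables in each model. As a rough sketch (the function name implied_correlation is our own), the relation can be inverted to see what correlation each reported VIF implies:

```python
import math


def implied_correlation(vif: float) -> float:
    """Absolute correlation between two predictors implied by their shared VIF.

    Valid only for a model with exactly two predictors, where VIF = 1 / (1 - r^2).
    """
    if vif < 1:
        raise ValueError("VIF cannot be below 1")
    return math.sqrt(1 - 1 / vif)


# VIF values reported for the two models in this post
for name, vif in [("f23", 1.89), ("ft3", 3.21)]:
    print(name, round(implied_correlation(vif), 3))
```

This only works for the two-predictor case; with more predictors the VIF depends on the R-squared of regressing each predictor on all the others.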
​​
These predicted final values are point estimates, so we also need a range (interval) for final that allows for some deviation around the prediction. These are called confidence intervals (for an individual student's score, the stricter term is prediction interval). They are calculated using the standard error of the estimate. The model that uses total and quiz3 to predict the final score has a standard error of estimate of 2.857, and the model that uses quiz2 and quiz3 has a standard error of estimate of 6.381.
​
For example, suppose we want a confidence interval around a predicted score of 59.11 obtained from the model named ft3. We use the interval formula with ft3's standard error of estimate, 2.857.
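A rough way to carry out this calculation is to take the predicted score plus or minus a multiplier times the standard error of estimate. The sketch below assumes a 95% level and the normal multiplier 1.96; both are our assumptions, and the function name approx_interval is hypothetical. It ignores the extra uncertainty in the estimated coefficients, so an interval from the statistical software will be somewhat wider.

```python
def approx_interval(predicted: float, std_error: float, z: float = 1.96):
    """Approximate interval: predicted +/- z * standard error of estimate."""
    margin = z * std_error
    return predicted - margin, predicted + margin


# Predicted score and standard error of estimate for ft3, as quoted in the text
low, high = approx_interval(59.11, 2.857)
print(f"{low:.2f} to {high:.2f}")
```

Under these assumptions the range for a predicted score of 59.11 runs from about 53.51 to 64.71. The same function with 6.381 as the standard error gives the much wider range that comes with the f23 model.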

This is a simple explanation of how to predict final scores with the best models that can be built from the available predictor variables. The model named ft3, which uses total as a predictor, can predict final only if the total score is either available or assumed; the final score is then predicted from it by the model.