In this article, I will go over the various evaluation metrics available for a regression model, along with the advantages and disadvantages of each. Please note, this article isn’t about the in-depth mathematics behind these metrics; instead, it focuses on their application.
The evaluation metrics which we will cover are:
- R Squared / Adjusted R Squared
- Mean Squared Error (MSE) / Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
Preparing our data
Before we jump into the metrics, let’s quickly import our data, do a little cleanup, and fit the data to a linear regression model. We will then evaluate that model’s performance using the metrics mentioned above.
# Importing Libraries
from sklearn.datasets import fetch_california_housing
# Importing Data
X, y = fetch_california_housing(as_frame=True, return_X_y=True)
# Splitting the data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)
# Pre-processing the data
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
# Creating pre-process pipeline
preprocessing = Pipeline(steps=[
('impute', SimpleImputer()),
('scale', StandardScaler()),
])
X_train_processed = preprocessing.fit_transform(X_train)
# Simple Linear Regression
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X_train_processed, y_train)
# Predictions
y_pred = lin_reg.predict(X_train_processed)
The above code:
- Fetches the data
- Splits the data into training and test set
- Pre-processes the data
- Fits the data into a linear regression model
- Predicts the result using the linear regression model
Please note: this article focuses only on the evaluation metrics, not on the steps above. At the time of writing, I do have a separate article about splitting the data; a link to that article can be found here.
Evaluating the model
R Square/Adjusted R Square
- R Square measures how much of the variability in the dependent variable can be explained by the model. It is the square of the correlation coefficient between the actual and predicted values
- It’s a good measure of how well the model fits the dependent variable
- It doesn’t take the overfitting problem into consideration: adding more independent variables can only increase it
- The best possible score is 1
- Adjusted R Square penalises additional independent variables added to the model, adjusting the metric to mitigate the overfitting issue
from sklearn.metrics import r2_score
predicted_r2_score = r2_score(y_train, y_pred)
print(f'R2 score of predicted values: {predicted_r2_score}')
Output:
R2 score of predicted values: 0.6125511913966952
As we can see, about 61% of the variability in the dependent variable can be explained by the model.
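scikit-learn does not expose Adjusted R Square directly, but it can be derived from r2_score. The sketch below uses small hypothetical arrays in place of y_train and y_pred, with an assumed number of independent variables p:

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual and predicted values for illustration
y_true = np.array([3.0, 2.5, 4.0, 5.1, 3.3, 4.8])
y_pred = np.array([2.8, 2.7, 3.9, 5.0, 3.5, 4.6])

n = len(y_true)  # number of observations
p = 2            # number of independent variables (assumed for this example)

r2 = r2_score(y_true, y_pred)
# Adjusted R Square penalises the extra independent variables
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f'R2: {r2:.4f}, Adjusted R2: {adjusted_r2:.4f}')
```

Note that the adjusted value is always at most the plain R Square, and the gap widens as more independent variables are added.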
Mean Squared Error (MSE) / Root Mean Squared Error (RMSE)
- Mean Squared Error
- It is the sum of the squared prediction errors (actual output minus predicted output) divided by the number of data points
- It gives an absolute number indicating how much the predicted results deviate from the actual values
- It doesn’t provide much insight on its own, but it is a good metric for comparing different models
- It penalises large prediction errors more heavily
- Root Mean Squared Error
- It’s the square root of MSE
- It is more commonly used than MSE because it is on the same scale as the target variable
import numpy as np
from sklearn.metrics import mean_squared_error
predicted_mse = mean_squared_error(y_train, y_pred)
print(f'Predicted MSE: {predicted_mse}')
predicted_rmse = np.sqrt(predicted_mse)
print(f'Predicted RMSE: {predicted_rmse}')
Output:
Predicted MSE: 0.5179331255246699
Predicted RMSE: 0.7196757085831575
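To make the definition above concrete, MSE and RMSE can also be computed by hand with NumPy. This sketch uses small hypothetical arrays and checks the result against scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values for illustration
y_true = np.array([3.0, 2.5, 4.0, 5.1])
y_pred = np.array([2.8, 2.7, 3.9, 5.0])

errors = y_true - y_pred    # prediction errors
mse = np.mean(errors ** 2)  # mean of the squared errors
rmse = np.sqrt(mse)         # square root of MSE

# Matches scikit-learn's implementation
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
print(f'MSE: {mse}, RMSE: {rmse}')
```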
Mean Absolute Error (MAE)
- It is similar to MSE. The only difference is that instead of summing the squares of the errors (as in MSE), it sums their absolute values
- Compared to MSE or RMSE, it is a more direct representation of the error terms
- It weights all errors equally, so it is less sensitive to outliers than MSE
from sklearn.metrics import mean_absolute_error
predicted_mae = mean_absolute_error(y_train, y_pred)
print(f'Predicted MAE: {predicted_mae}')
Output:
Predicted MAE: 0.5286283596581934
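As with MSE, the MAE formula reduces to a one-line NumPy expression. The sketch below uses the same hypothetical arrays and verifies against scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values for illustration
y_true = np.array([3.0, 2.5, 4.0, 5.1])
y_pred = np.array([2.8, 2.7, 3.9, 5.0])

mae = np.mean(np.abs(y_true - y_pred))  # mean of the absolute errors

# Matches scikit-learn's implementation
assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
print(f'MAE: {mae}')
```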
Conclusion
There are other evaluation metrics as well, such as explained variance, max error, and root mean squared log error. I have only discussed the ones above since they are the most common and widely used.
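For completeness, scikit-learn exposes these additional metrics too. A quick sketch with hypothetical arrays:

```python
import numpy as np
from sklearn.metrics import (explained_variance_score, max_error,
                             mean_squared_log_error)

# Hypothetical actual and predicted values for illustration
y_true = np.array([3.0, 2.5, 4.0, 5.1])
y_pred = np.array([2.8, 2.7, 3.9, 5.0])

print(explained_variance_score(y_true, y_pred))         # 1.0 is a perfect score
print(max_error(y_true, y_pred))                        # worst-case absolute error
print(np.sqrt(mean_squared_log_error(y_true, y_pred)))  # root mean squared log error
```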