Friday, October 7, 2022

[FIXED] How sklearn.metrics.r2_score works

Issue

I tried to implement the R2 formula from Wikipedia, but my result differs from the one returned by sklearn.metrics.r2_score. Why?

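import statistics

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])

# R2 as computed by scikit-learn
r2 = r2_score(y_true, y_pred)
print(r2)

# R2 computed by hand: 1 - SS_res / SS_tot
y_true_mean = statistics.mean(y_true)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)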

-1.9999999999999996
0.0


Solution

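The different outcome originates in statistics.mean. For the manual formula to return 0.0 here, the mean it used must have been 0 rather than 2/3, which suggests that statistics.mean converted its result back to the integer type of the NumPy array and truncated it. With a mean of 0, the denominator becomes 2 instead of 2/3, so the formula gives 1 - 2/2 = 0.0 instead of 1 - 2/(2/3) = -2. Use np.mean instead. That gives the same R2 as sklearn: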

import numpy as np

y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])

# np.mean returns a float (2/3 here), even for an integer array
y_true_mean = np.mean(y_true)
# R2 = 1 - SS_res / SS_tot
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)
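
To make the effect of the mean visible, here is a minimal sketch that computes R2 by hand with two different means: the true mean 2/3 and a mean truncated to 0. The helper r2_manual and its mean parameter are hypothetical names introduced only for this illustration; they are not part of sklearn or NumPy.

```python
import numpy as np


def r2_manual(y_true, y_pred, mean=None):
    """Hypothetical helper: R2 = 1 - SS_res / SS_tot, with an optional mean override."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if mean is None:
        mean = y_true.mean()  # float mean, 2/3 for the data below
    ss_res = np.sum((y_true - y_pred) ** 2)  # residual sum of squares
    ss_tot = np.sum((y_true - mean) ** 2)    # total sum of squares around `mean`
    return 1 - ss_res / ss_tot


y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])

r2_float_mean = r2_manual(y_true, y_pred)             # uses mean = 2/3
r2_truncated_mean = r2_manual(y_true, y_pred, mean=0)  # simulates a mean truncated to 0

print(r2_float_mean)      # approximately -2, in line with sklearn's result
print(r2_truncated_mean)  # 0.0, the value from the question
```

If you prefer to stay with the standard library, statistics.fmean (Python 3.8+) is documented to always return a float, so it should avoid the truncation as well.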


Answered By - agtoever
Answer Checked By - Dawn Plyler (PHPFixing Volunteer)
