Issue

According the answer to this post,

The most classic "correlation" measure between a nominal and an interval ("numeric") variable is Eta, also called correlation ratio, and equal to the root R-square of the one-way ANOVA (with p-value = that of the ANOVA). Eta can be seen as a symmetric association measure, like correlation, because Eta of ANOVA (with the nominal as independent, numeric as dependent) is equal to Pillai's trace of multivariate regression (with the numeric as independent, set of dummy variables corresponding to the nominal as dependent).

I would appreciate if you could let me know how to compute Eta in python.

In fact, I have a dataframe with some numeric and some nominal variables.

Besides, how to plot a heatmap like plot for it?

Solution

The answer is provided here:

def correlation_ratio(categories, measurements):
        fcat, _ = pd.factorize(categories)
        cat_num = np.max(fcat)+1
        y_avg_array = np.zeros(cat_num)
        n_array = np.zeros(cat_num)
        for i in range(0,cat_num):
            cat_measures = measurements[np.argwhere(fcat == i).flatten()]
            n_array[i] = len(cat_measures)
            y_avg_array[i] = np.average(cat_measures)
        y_total_avg = np.sum(np.multiply(y_avg_array,n_array))/np.sum(n_array)
        numerator = np.sum(np.multiply(n_array,np.power(np.subtract(y_avg_array,y_total_avg),2)))
        denominator = np.sum(np.power(np.subtract(measurements,y_total_avg),2))
        if numerator == 0:
            eta = 0.0
        else:
            eta = numerator/denominator
        return eta

Answered By - User777

Answer Checked By - Robin (PHPFixing Admin)

Friday, October 7, 2022

[FIXED] How to compute correlation ratio or Eta in Python?

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Friday, October 7, 2022

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Subscribe To