Monday, November 28, 2022

[FIXED] How to sync color between Seaborn and pandas pie plot

Issue

I am struggling to keep colors in sync between a seaborn.countplot and a pandas.DataFrame.plot pie chart.

I found a similar question on SO, but its approach does not work with a pie chart; it throws an error:

TypeError: pie() got an unexpected keyword argument 'color'

I searched the documentation, but all I could find is that I can set a colormap and a palette, and even with the same colormap and palette the two plots were still not in sync.

(Image: result of using the same colormap and palette)

My code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('https://andybek.com/pandas-sat')
cat_vars = ['Borough', 'SAT Section']

for var in list(cat_vars):
    fig, ax = plt.subplots(1, 2, figsize=(10, 5))

    df[var].value_counts().plot(kind='pie', autopct=lambda v: f'{v:.2f}%', ax=ax[0])
    cplot = sns.countplot(data=df, x=var, ax=ax[1])

    for patch in cplot.patches:
        cplot.annotate(
            format(patch.get_height()),
            (
                patch.get_x() + patch.get_width() / 2,
                patch.get_height()
            )
        )
    plt.show()

(Image: illustration of the problem)

As you can see, colors are not in sync with labels.


Solution

I added the order argument to sns.countplot(), using the index of value_counts() so the bars follow the same order as the pie wedges. This changes the order in which seaborn draws the categories, and as a consequence the colours in both plots match.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('https://andybek.com/pandas-sat')
cat_vars = ['Borough', 'SAT Section']

for var in list(cat_vars):
    fig, ax = plt.subplots(1, 2, figsize=(10, 5))

    df[var].value_counts().plot(kind='pie', autopct=lambda v: f'{v:.2f}%', ax=ax[0])
    cplot = sns.countplot(data=df, x=var, ax=ax[1],
                         order=df[var].value_counts().index)

    for patch in cplot.patches:
        cplot.annotate(
            format(patch.get_height()),
            (
                patch.get_x() + patch.get_width() / 2,
                patch.get_height()
            )
        )

plt.show()

Explanation: Colors are assigned by position, not by label. So, if the bars in the sns.countplot are in a different order than the wedges of the pie, the two plots will show different colors for the same label.
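If you would rather not depend on both plots happening to use the same order, another option is to make the label-to-color mapping explicit and hand it to both plots. The following is only a sketch under a few assumptions: plot_synced is a hypothetical helper name (not part of pandas or seaborn), the colors come from seaborn's current default palette, and the exact legend behaviour depends on the seaborn version. Note that the pie chart takes colors (plural), which is why passing color raised the TypeError in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_synced(df, column, axes):
    """Draw a pie chart and a count plot that share one label -> color mapping.

    Hypothetical helper for illustration; not part of pandas or seaborn.
    """
    counts = df[column].value_counts()

    # One color per label. The default palette repeats once it runs out of colors.
    colors = sns.color_palette(n_colors=len(counts))
    color_by_label = dict(zip(counts.index, colors))

    # The pie chart takes `colors` (plural), one entry per wedge, in wedge order.
    counts.plot(
        kind='pie',
        colors=[color_by_label[label] for label in counts.index],
        autopct='%.2f%%',
        ax=axes[0],
    )

    # hue=column lets seaborn look each bar's color up by label in the dict.
    # saturation=1 stops seaborn from muting the colors relative to the pie.
    sns.countplot(
        data=df, x=column, hue=column, order=counts.index,
        palette=color_by_label, dodge=False, saturation=1, ax=axes[1],
    )

    # Some seaborn versions add a legend for hue; it is redundant here.
    legend = axes[1].get_legend()
    if legend is not None:
        legend.remove()

    return color_by_label
```

Inside the loop from the answer, this would be called as plot_synced(df, var, ax) right after plt.subplots(1, 2, figsize=(10, 5)). The order argument is still passed so the bars and wedges appear in the same sequence, but the colors no longer depend on it.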



Answered By - Angel Bujalance
Answer Checked By - David Goodson (PHPFixing Volunteer)
