Issue
I have a dataframe (survey) that I need to group by 2 columns. One of the 2 columns is a ranking (5 options: Very Poor, Poor, Average, Good and Excellent) and the second one is a list of times. I need to group by both of those columns like this:
Ranking   | Time | Count of how many times the time appears in the column "time" for a ranking
-------------------------------------
Very Poor | 0.0  | 6
          | 1.0  | 2
          | 2.0  | 9
-------------------------------------
Poor      | 0.0  | 3
          | 1.0  | 12
...
I need to show the results of this table in 5 graphs (one for each ranking), with x=Time and y=Count.
I've been stuck for a few hours now. Can someone help?
Solution
Set up an MRE:
import numpy as np
import pandas as pd

rank = ['Very Poor', 'Poor', 'Average', 'Good', 'Excellent']
df = pd.DataFrame({'Ranking': np.random.choice(rank, 100),
                   'Time': np.random.randint(1, 50, 100)})
print(df)
# Output:
      Ranking  Time
0   Excellent    28
1        Poor    33
2   Excellent    28
3     Average    22
4   Very Poor    11
..        ...   ...
95  Very Poor    13
96    Average    26
97  Very Poor    23
98       Good    24
99       Good    36

[100 rows x 2 columns]
Use value_counts to count (Ranking, Time) rather than groupby:
count = df.value_counts(['Ranking', 'Time']).rename('Count').reset_index()
print(count)
# Output:
      Ranking  Time  Count
0        Poor    41      3
1   Very Poor    46      3
2   Very Poor    49      2
3   Very Poor    17      2
4   Excellent    20      2
..        ...   ...    ...
81  Excellent    34      1
82  Excellent    32      1
83  Excellent    27      1
84  Excellent    26      1
85       Good    32      1

[86 rows x 3 columns]
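If you would rather stay with groupby, as in the question, the same counts can probably be obtained with size(). The sketch below is only an illustration: the seeded generator and the ordered categorical are my own assumptions (not part of the original answer), added so the result is reproducible and sorted from Very Poor to Excellent, as in the table from the question.

```python
import numpy as np
import pandas as pd

rank = ['Very Poor', 'Poor', 'Average', 'Good', 'Excellent']

# Assumption: a seeded generator, only so the example is reproducible
rng = np.random.default_rng(0)
df = pd.DataFrame({'Ranking': rng.choice(rank, 100),
                   'Time': rng.integers(1, 50, 100)})

# Ordered categorical so that groups sort by ranking level, not alphabetically
df['Ranking'] = pd.Categorical(df['Ranking'], categories=rank, ordered=True)

# size() counts the rows in each (Ranking, Time) group;
# observed=True keeps only the pairs that actually occur in the data
grouped = (df.groupby(['Ranking', 'Time'], observed=True)
             .size()
             .rename('Count')
             .reset_index())
print(grouped.head())
```

Unlike value_counts, which sorts by descending count, this output is sorted by Ranking and then by Time.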
To visualize the data, the easiest way is to use seaborn and displot:
# Python env: pip install seaborn
# Anaconda env: conda install seaborn
import seaborn as sns
import matplotlib.pyplot as plt
sns.displot(df, x='Time', col='Ranking', binwidth=1)
plt.show()
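If seaborn is not an option, the precomputed count table can also be plotted directly with matplotlib, one subplot per ranking, with x=Time and y=Count. This is a minimal sketch rather than part of the original answer: the seeded data, the figure size and the bar width are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rank = ['Very Poor', 'Poor', 'Average', 'Good', 'Excellent']

# Assumption: seeded sample data standing in for the real survey
rng = np.random.default_rng(0)
df = pd.DataFrame({'Ranking': rng.choice(rank, 100),
                   'Time': rng.integers(1, 50, 100)})
count = df.value_counts(['Ranking', 'Time']).rename('Count').reset_index()

# One subplot per ranking, sharing both axes so the panels are comparable
fig, axes = plt.subplots(1, len(rank), figsize=(20, 4), sharex=True, sharey=True)
for ax, name in zip(axes, rank):
    sub = count[count['Ranking'] == name].sort_values('Time')
    ax.bar(sub['Time'], sub['Count'], width=1)
    ax.set_title(name)
    ax.set_xlabel('Time')
axes[0].set_ylabel('Count')
fig.tight_layout()
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to a file.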
Answered By - Corralien Answer Checked By - Gilberto Lyons (PHPFixing Admin)