Issue
I have a dataframe like this:
df:
       Score
group
A        100
A         34
A         40
A         30
C         24
C         60
C         35
For every group in the data, I want to find the percentile rank of the score 35 (i.e., where 35 falls within each group's scores).
I tried different tricks but none of them worked.
scipy.stats.percentileofscore(df['Score'], 35, kind='weak')
--> This works, but it doesn't give me the percentile for each group
df.groupby('group')['Score'].percentileofscore()
--> 'SeriesGroupBy' object has no attribute 'percentileofscore'
scipy.stats.percentileofscore(df.groupby('group')[['Score']], 35, kind='strict')
--> TypeError: '<' not supported between instances of 'str' and 'int'
My ideal output looks like this:
df:
       Score Percentile
group
A                    50
C                    33
Can anyone suggest to me what works well here?
Solution
The inverse quantile function of a sequence at a point X is the proportion of values in the sequence that are less than X, right? (This is the kind='strict' definition in scipy.stats.percentileofscore.) So:
In [158]: df["Score"].lt(35).groupby(df["group"]).mean().mul(100)
Out[158]:
group
A    50.000000
C    33.333333
Name: Score, dtype: float64
- get a True/False Series marking whether each "Score" is < 35
- group this Series over "group"
- take the mean
- since True == 1 and False == 0, it will effectively give the proportion!
- multiply by 100 (with mul) to get percentages
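The steps above can be put together as a self-contained sketch. It assumes "group" is a regular column (as df["group"] in the answer implies) and rebuilds the sample data from the question. The names x, strict, and weak are illustrative and do not come from the original answer. The second calculation, using le instead of lt, is an added variant showing the "less than or equal" definition that kind='weak' uses in scipy.

```python
import pandas as pd

# Sample data from the question, with "group" as an ordinary column (assumption).
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "C", "C", "C"],
    "Score": [100, 34, 40, 30, 24, 60, 35],
})

x = 35  # the score whose percentile rank we want in each group

# Strict definition: percentage of values strictly below x, per group.
# The boolean Series averages to a proportion because True == 1 and False == 0.
strict = df["Score"].lt(x).groupby(df["group"]).mean().mul(100)

# Weak definition (illustrative variant): percentage of values <= x, per group.
weak = df["Score"].le(x).groupby(df["group"]).mean().mul(100)

result = pd.DataFrame({"strict": strict, "weak": weak})
print(result.round(2))
# strict: A 50.0, C 33.33
# weak:   A 50.0, C 66.67
```

The strict column matches the output the question asks for. If "group" is the index rather than a column, as the display in the question suggests, grouping with groupby(level="group") should work in place of groupby(df["group"]).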
Answered By - Mustafa Aydın Answer Checked By - Robin (PHPFixing Admin)