Issue
As the DataFrame below shows, there are three races. I want to find the time difference between first and second place in each race, and then output the average margin by which each runner won their races.
import pandas as pd
# initialise data of lists.
data = {
    'Name': ['A', 'B', 'B', 'C', 'A', 'C'],
    'RaceNumber': [1, 1, 2, 2, 3, 3],
    'PlaceWon': ['First', 'Second', 'First', 'Second', 'First', 'Second'],
    'TimeRanInSec': [100, 98, 66, 60, 75, 70],
}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
In this case, the output would be a DataFrame showing that A won races by an average of 3.5 seconds and B won by an average of 6 seconds.
I imagine this could be done by grouping by RaceNumber and then subtracting the TimeRanInSec values, but I am unsure how to get the average for each Name.
Solution
I think you need two groupby operations, one to get the winning margin for each race, and then one to get the average winning margin for each person.
For a general solution, I would first define a function that calculates the winning margin from a list of times (for one race). Then you can apply that function to the times in each race group and join the resulting winning margins to the dataframe of all the winners. Then it's easy to get the desired averages:
def winning_margin(times):
    # Gap between the smallest and second-smallest time in one race.
    times = list(times)
    winner = min(times)
    times.remove(winner)
    return min(times) - winner

# First groupby: one winning margin per race.
winning_margins = df[['RaceNumber', 'TimeRanInSec']] \
    .groupby('RaceNumber').agg(winning_margin)
winning_margins.columns = ['margin']

# Attach each race's margin to the row of its winner.
winners = df.loc[df.PlaceWon == 'First', :]
winners = winners.join(winning_margins, on='RaceNumber')

# Second groupby: average margin per runner.
avg_margins = winners[['Name', 'margin']].groupby('Name').mean()
avg_margins
      margin
Name
A        3.5
B        6.0
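If every race is guaranteed to have exactly one 'First' row and one 'Second' row, as in the sample data, the margin can also be read off the PlaceWon labels instead of searching for the minimum time. The following is only a sketch under that assumption, not a replacement for the general function above. Note that in the sample data the first-place time is the larger one, so the sketch takes the absolute difference.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'B', 'C', 'A', 'C'],
    'RaceNumber': [1, 1, 2, 2, 3, 3],
    'PlaceWon': ['First', 'Second', 'First', 'Second', 'First', 'Second'],
    'TimeRanInSec': [100, 98, 66, 60, 75, 70],
})

# One row per race, one column per place; the values are the times.
# Assumption: each race has exactly one 'First' and one 'Second' row.
times = df.pivot(index='RaceNumber', columns='PlaceWon', values='TimeRanInSec')

# Absolute gap between first and second place in each race.
margin = (times['First'] - times['Second']).abs().rename('margin')

# Attach each race's margin to its winner, then average per runner.
winners = df[df['PlaceWon'] == 'First'].join(margin, on='RaceNumber')
avg_margins = winners.groupby('Name')['margin'].mean()
print(avg_margins)
```

This returns a Series indexed by Name rather than a one-column DataFrame. If a race can have more than two finishers or tied times, the winning_margin function above is the safer choice.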
Answered By - Arne
Answer Checked By - Candace Johnson (PHPFixing Volunteer)