PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Friday, May 13, 2022

[FIXED] How to add rows as sums of other rows in DataFrame?

 May 13, 2022     aggregate, append, dataframe, pandas, python     No comments   

Issue

I'm not sure I titled this post correctly but I have a unique situation where I want to append a new set of rows to an existing DataFrame as a sum of rows from existing sets and I'm not sure where to start.

For example, I have the following DataFrame:

import pandas as pd

data = {'Team': ['Atlanta', 'Atlanta', 'Cleveland', 'Cleveland'],
        'Position': ['Defense', 'Kicker', 'Defense', 'Kicker'],
        'Points': [5, 10, 15, 20]}

df = pd.DataFrame(data)

print(df)

        Team Position  Points
0    Atlanta  Defense       5
1    Atlanta   Kicker      10
2  Cleveland  Defense      15
3  Cleveland   Kicker      20

How do I create/append new rows which create a new position for each team and sum the points of the two existing positions for each team? Additionally, the full dataset consists of several more teams so I'm looking for a solution that will work for any number of teams.

edit: I forgot to include that there are other positions in the complete DataFrame; but, I only want this solution to be applied for the positions "Defense" and "Kicker".

My desired output is below.

            Team Position  Points
        Team          Position  Points
0    Atlanta           Defense       5
1    Atlanta            Kicker      10
2  Cleveland           Defense      15
3  Cleveland            Kicker      20
4    Atlanta  Defense + Kicker      15
5  Cleveland  Defense + Kicker      35

Thanks in advance!


Solution

We can use groupby agg to create the summary rows then append to the DataFrame:

df = df.append(df.groupby('Team', as_index=False).agg({
    'Position': ' + '.join,  # Concat Strings together
    'Points': 'sum'  # Total Points
}), ignore_index=True)

df:

        Team          Position  Points
0    Atlanta           Defense       5
1    Atlanta            Kicker      10
2  Cleveland           Defense      15
3  Cleveland            Kicker      20
4    Atlanta  Defense + Kicker      15
5  Cleveland  Defense + Kicker      35

We can also whitelist certain positions by filtering df before groupby to aggregate only the desired positions:

whitelisted_positions = ['Kicker', 'Defense']
df = df.append(
    df[df['Position'].isin(whitelisted_positions)]
        .groupby('Team', as_index=False).agg({
        'Position': ' + '.join,  # Concat Strings together
        'Points': 'sum'  # Total Points
    }), ignore_index=True
)


Answered By - Henry Ecker
Answer Checked By - Katrina (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing