PHPFixing

Friday, May 13, 2022

[FIXED] How to append dataframe values to empty lists based on conditions

Tags: append, loops, pandas, python

Issue

I have the following dataframe:

import pandas as pd

# Note: missing values are stored as the string 'NaN', not as a float NaN
dictionary = {'Monday': {'John': 5,
                         'Lisa': 1,
                         'Karyn': 'NaN',
                         'steve': 1,
                         'ryan': 4,
                         'chris': 5,
                         'jessie': 6},
              'Friday': {'John': 0,
                         'Lisa': 1,
                         'Karyn': 'NaN',
                         'steve': 4,
                         'ryan': 7,
                         'chris': 'NaN',
                         'jessie': 11},
              'Saturday': {'John': 0,
                           'Lisa': 1,
                           'Karyn': 2,
                           'steve': 4,
                           'ryan': 'NaN',
                           'chris': 'NaN',
                           'jessie': 1}}

tab = pd.DataFrame(dictionary)
        Monday  Friday  Saturday
John         5       0         0
Lisa         1       1         1
Karyn      NaN     NaN         2
steve        1       4         4
ryan         4       7       NaN
chris        5     NaN       NaN
jessie       6      11         1

I have these empty lists:

mon_only = []
fri_only = []
sat_only = []
mon_fri_only = []
mon_sat_only = []
fri_sat_only = []
mon_fri_sat = []

I would like to append each index label to one of these lists based on where it falls. An index label counts as present in a column if its value there is greater than zero. If it is present only in the Monday column, it goes in the mon_only list. If it is present in all three columns, it goes in the mon_fri_sat list.

The results should look like this:

mon_only = ['John','chris']
fri_only = []
sat_only = ['Karyn']
mon_fri_only = ['ryan']
mon_sat_only = []
fri_sat_only = []
mon_fri_sat = ['Lisa','steve','jessie']

Solution

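You can use itertools.combinations to build every combination of column names. Then create a boolean mask marking the values that are neither 'NaN' nor 0, and use DataFrame.dot to join, for each row, the names of the columns where the mask is True. Finally, group the row labels by that joined string, reindex on the full list of combinations, and fill the missing ones with [].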

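from itertools import combinations
from collections import defaultdict

delim = ","
# True where a value counts as present: neither the string "NaN" nor 0
c = ~(tab.eq("NaN") | tab.eq(0))
# For each row, concatenate the names of the columns where c is True
d = c.dot(c.columns + delim).str.rstrip(delim)

# Every combination of column names, from single columns up to all of them
ind = [delim.join(idx) for i in range(1, len(tab.columns) + 1)
       for idx in combinations(tab.columns, i)]

# Group the row labels by their column combination
defd = defaultdict(list)
for k, v in d.items():
    if k not in defd[v]:  # skip a row label that is already recorded
        defd[v].append(k)

out_d = pd.Series(defd).reindex(ind, fill_value=[]).to_dict()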

Output:

print(out_d)

{'Monday': ['John', 'chris'],
 'Friday': [],
 'Saturday': ['Karyn'],
 'Monday,Friday': ['ryan'],
 'Monday,Saturday': [],
 'Friday,Saturday': [],
 'Monday,Friday,Saturday': ['Lisa', 'steve', 'jessie']}

out_d already holds the result; index it by key to get each list, for example out_d['Monday'] for mon_only and out_d['Monday,Friday,Saturday'] for mon_fri_sat.
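
As a minimal sketch of that last step, the dictionary can be unpacked into the seven named lists from the question. The out_d literal below is copied from the output above so the snippet runs on its own. The .get(key, []) default is an added safeguard for a missing key, not part of the original answer.

```python
# Copy of the dictionary printed above, so this snippet is self-contained.
out_d = {
    'Monday': ['John', 'chris'],
    'Friday': [],
    'Saturday': ['Karyn'],
    'Monday,Friday': ['ryan'],
    'Monday,Saturday': [],
    'Friday,Saturday': [],
    'Monday,Friday,Saturday': ['Lisa', 'steve', 'jessie'],
}

# Look up each combination key; fall back to an empty list if it is absent.
mon_only = out_d.get('Monday', [])
fri_only = out_d.get('Friday', [])
sat_only = out_d.get('Saturday', [])
mon_fri_only = out_d.get('Monday,Friday', [])
mon_sat_only = out_d.get('Monday,Saturday', [])
fri_sat_only = out_d.get('Friday,Saturday', [])
mon_fri_sat = out_d.get('Monday,Friday,Saturday', [])

print(mon_only)     # ['John', 'chris']
print(mon_fri_sat)  # ['Lisa', 'steve', 'jessie']
```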


If you do not need the empty combinations, the same code is shorter:

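from collections import defaultdict

defd = defaultdict(list)
# True where a value counts as present: neither the string "NaN" nor 0
c = ~(tab.eq("NaN") | tab.eq(0))
# For each row, concatenate the names of the columns where c is True
d = c.dot(c.columns + ',').str.rstrip(",")
for k, v in d.items():
    if k not in defd[v]:  # skip a row label that is already recorded
        defd[v].append(k)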

print(defd)
defaultdict(list,
            {'Monday': ['John', 'chris'],
             'Monday,Friday,Saturday': ['Lisa', 'steve', 'jessie'],
             'Saturday': ['Karyn'],
             'Monday,Friday': ['ryan']})
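
An alternative sketch, not part of the answer above, applies the question's "greater than zero" rule directly. It assumes the 'NaN' strings should be treated as missing values, so it converts them with pd.to_numeric first. The names data, present, labels, and groups are illustrative only.

```python
import pandas as pd

data = {
    'Monday': {'John': 5, 'Lisa': 1, 'Karyn': 'NaN', 'steve': 1,
               'ryan': 4, 'chris': 5, 'jessie': 6},
    'Friday': {'John': 0, 'Lisa': 1, 'Karyn': 'NaN', 'steve': 4,
               'ryan': 7, 'chris': 'NaN', 'jessie': 11},
    'Saturday': {'John': 0, 'Lisa': 1, 'Karyn': 2, 'steve': 4,
                 'ryan': 'NaN', 'chris': 'NaN', 'jessie': 1},
}
tab = pd.DataFrame(data)

# Coerce every column to numbers; the 'NaN' strings become real NaN,
# and NaN > 0 evaluates to False, so missing values count as absent.
present = tab.apply(pd.to_numeric, errors='coerce').gt(0)

# For each row label, join the names of the columns where it is present.
labels = {
    name: ','.join(col for col in present.columns if present.at[name, col])
    for name in present.index
}

# Group row labels by their column combination.
groups = {}
for name, label in labels.items():
    groups.setdefault(label, []).append(name)

print(groups['Monday,Friday'])  # ['ryan']
```

Combinations that no row falls into, such as 'Friday', do not appear as keys in groups, so groups.get('Friday', []) is one way to read them safely.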


Answered By - anky
Answer Checked By - Candace Johnson (PHPFixing Volunteer)