PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Saturday, October 29, 2022

[FIXED] How to apply filter condition during Pandas Merge operation but retain them in the result DF?

 October 29, 2022     conditional-statements, left-join, pandas, python     No comments   

Issue

I'm trying to do something with pandas equivalent to the following SQL command:

select *
from tableA as A
left join tableB as B
on A.key = B.key and substr(A.value,1,2) not in ('something', 'something else')

As you can see in the sql query above, A.value have a filter condition applied during the join, but since it's a left join, the on condition not in criteria does not persist to the result dataset, it only applies during the join operation, such that the join operation will not consider any rows fulfilling the A.value not in ('something', 'something else') criteria.

I tried the following python code, but it results in filtering out the A.value rows from the result dataset, I need all the rows via the left join, but i don't want the merge operation on the rows as specified by the a.value not in condition, just like the SQL statement logic above.

df_output = df_input1[~df_input1['value'].str[:2].isin(['something', 'something_else'])].merge(df_input2, left_on = 'key', right_on = 'key', how='left')

Thank you all in advance!


Solution

A bit hacky, but the idea is to create a new column for joining in df1 and to only fill in this column for rows that you want to merge with:

import pandas as pd

df_input1 = pd.DataFrame({
    'fruit':['apple','banana','mango'],
    'quantity':[3,4,5],
})

df_input2 = pd.DataFrame({
    'fruit':['apple','banana','mango'],
    'in_stock':[True,False,True],
})

minimum_quantity = 4
ind = df_input1['quantity'].ge(minimum_quantity)
df_input1.loc[ind, 'merge_column'] = df_input1.loc[ind, 'fruit']

merge_filtered_fruits = df_input1.merge(
    df_input2,
    left_on = 'merge_column',
    right_on = 'fruit',
    how = 'left'
).drop(columns=['merge_column'])

print(merge_filtered_fruits)

Output:

  fruit_x  quantity fruit_y in_stock
0   apple         3     NaN      NaN
1  banana         4  banana    False
2   mango         5   mango     True

I'm curious what a better method could be



Answered By - mitoRibo
Answer Checked By - Dawn Plyler (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing