PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Wednesday, November 2, 2022

[FIXED] How to increase speed on finding an index?

 November 02, 2022     indexing, pandas, python, subset     No comments   

Issue

I have two dataframes: df2

Time(s) vehicle_id
10          1
11          1
12          1
13          1
...         ...
78          2
79          2
80          2

And com:

index    v1    v2
  1       1    2

I would like to find out the index of the row which matches the following condition:

if (abs(max(df2.loc[df2['vehicle_id'] == com['v1'][i], 'time(s)']) - max(df2.loc[df2['vehicle_id'] == com['v2'][i], 'time(s)']))>60.0):
    arr.append(i)

In this case, the condition is true since the max time from v1 is 13 and from v2 is 80, then, 13-80=-67=67 It works, however, it has to compare against more than 40M rows and it is taking about 200k rows/hour, so, it will take hours. Is there a faster way to proceed? I just need to obtain the indexes from com which matches the above condition.

Update: I check all the condition positions using a FOR loop

for i in range(0, len(com)):
if (abs(max(df2.loc[df2['vehicle_id'] == com['v1'][i], 'time(s)']) - max(df2.loc[df2['vehicle_id'] == com['v2'][i], 'time(s)']))>60.0):
        arr.append(i)

Solution

It looks like you only need to consider the max time for each id, so why not sort then drop duplicates to keep only the max time per vehicle id. Then you can replace the vehicle ids in your comm df w/the times and perform your filter conditions on the entire result, saving the resulting indexes to an array.

import pandas as pd

df2 = pd.DataFrame({'Time(s)': [10, 11, 12, 13, 78, 79, 80], 'vehicle_id': [1, 1, 1, 1, 2, 2, 2]})
comm = pd.DataFrame({'index': [1], 'v1': [1], 'v2': [2]})

m = df.sort_values(by='Time(s)').drop_duplicates(subset='vehicle_id', keep='last')
comm = comm.set_index('index').replace(m.set_index('vehicle_id')['Time(s)'].to_dict())
arr = comm.loc[(comm['v1']-comm['v2']).abs().gt(60)].index.values

print(arr)


Answered By - Chris
Answer Checked By - Clifford M. (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing