PHPFixing

Thursday, May 12, 2022

[FIXED] How to concat thousands of pandas dataframes generated by a for loop efficiently?

 May 12, 2022     append, dataframe, pandas, python

Issue

Thousands of DataFrames with consistent columns are generated in a for loop that reads different files, and I'm trying to merge / concat / append them into a single DataFrame, combined:

import pandas as pd

combined = pd.DataFrame()

for i in range(1, 1000):  # demo only
    df = generate_df()  # df is created here
    combined = pd.concat([combined, df])

This is initially fast but slows as combined grows, eventually becoming unusably slow, since each concat copies the entire combined frame and the total work grows quadratically. This answer on how to append rows explains that collecting rows in a dict and creating the DataFrame once at the end is most efficient, but I can't figure out how to do that with to_dict.
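For reference, the dict-based approach that answer describes can be sketched roughly as follows; read_row here is a hypothetical stand-in for whatever reads one file into a row:

```python
import pandas as pd

# Hypothetical stand-in for reading one file; returns one row as a dict.
def read_row(i):
    return {"id": i, "value": i * 2}

# Collect plain dicts first, then build the DataFrame once at the end.
rows = [read_row(i) for i in range(1, 1000)]
combined = pd.DataFrame(rows)
```

Appending to a Python list is cheap, so the expensive DataFrame construction happens only once.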

What's a good way to do this? Am I approaching this the wrong way?


Solution

You can create a list of DataFrames and then call concat only once:

dfs = []

for i in range(1, 1000):  # demo only
    df = generate_df()  # df is created here
    dfs.append(df)

combined = pd.concat(dfs)
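A self-contained version of this pattern might look like the sketch below; generate_df here is a hypothetical stand-in for the file-reading step, and ignore_index is optional if you want a clean 0..n-1 index on the result:

```python
import pandas as pd

# Hypothetical stand-in for reading one file into a one-row DataFrame.
def generate_df(i):
    return pd.DataFrame({"file": [i], "value": [i * 10]})

dfs = []
for i in range(1, 1000):  # demo only
    dfs.append(generate_df(i))

# One concat at the end; each input frame is copied exactly once.
combined = pd.concat(dfs, ignore_index=True)
```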


Answered By - jezrael
Answer Checked By - David Goodson (PHPFixing Volunteer)


Copyright © PHPFixing