PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Thursday, August 11, 2022

[FIXED] How to efficiently handle European decimal separators using the pandas read_csv function?

 August 11, 2022     csv, decimal, pandas, python     No comments   

Issue

I'm using read_csv to read CSV files into Pandas data frames. My CSV files contain large numbers of decimals/floats. The numbers are encoded using the European decimal notation:

1.234.456,78

This means that the '.' is used as the thousand separator and the ',' is the decimal mark.

Pandas 0.8. provides a read_csv argument called 'thousands' to set the thousand separator. Is there an additional argument to provide the decimal mark as well? If no, what is the most efficient way to parse a European style decimal number?

Currently I'm using string replace which I consider to be a significant performance penalty. The coding I'm using is this:

# Convert to float data type and change decimal point from ',' to '.'
f = lambda x: string.replace(x, u',', u'.')
df['MyColumn'] = df['MyColumn'].map(f)

Any help is appreciated.


Solution

You can use the converters kw in read_csv. Given /tmp/data.csv like this:

"x","y"                                                                         
"one","1.234,56"                                                                
"two","2.000,00"   

you can do:

In [20]: pandas.read_csv('/tmp/data.csv', converters={'y': lambda x: float(x.replace('.','').replace(',','.'))})
Out[20]: 
     x        y
0  one  1234.56
1  two  2000.00


Answered By - lbolla
Answer Checked By - David Marino (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing