# Issue

I'm using `read_csv` to read CSV files into Pandas data frames. My CSV files contain large numbers of decimals/floats, encoded using European decimal notation:

```
1.234.456,78
```

This means that the '.' is used as the thousand separator and the ',' is the decimal mark.

Pandas 0.8 provides a `read_csv` argument called `thousands` to set the thousand separator. Is there an additional argument to set the decimal mark as well? If not, what is the most efficient way to parse a European-style decimal number?

Currently I'm using string replacement, which I suspect carries a significant performance penalty. The code I'm using is this:

```
# Change the decimal mark from ',' to '.' and convert to float
f = lambda x: x.replace(u',', u'.')
df['MyColumn'] = df['MyColumn'].map(f).astype(float)
```
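If you stay with string replacement, a vectorized sketch (assuming a pandas version with `Series.str.replace` and its `regex` parameter) avoids calling a Python lambda per element; the column and sample values here are illustrative:

```python
import io
import pandas as pd

csv = '"x","y"\n"one","1.234,56"\n"two","2.000,00"\n'
df = pd.read_csv(io.StringIO(csv))

# Drop the thousand separator, swap the decimal comma for a point,
# then cast the whole column to float in one vectorized pass.
df['y'] = (df['y']
           .str.replace('.', '', regex=False)
           .str.replace(',', '.', regex=False)
           .astype(float))
```

This keeps the per-row work inside pandas rather than in a Python-level `map`.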

Any help is appreciated.

# Solution

You can use the `converters` keyword argument of `read_csv`. Given `/tmp/data.csv` like this:

```
"x","y"
"one","1.234,56"
"two","2.000,00"
```

you can do:

```
In [20]: pandas.read_csv('/tmp/data.csv', converters={'y': lambda x: float(x.replace('.','').replace(',','.'))})
Out[20]:
     x        y
0  one  1234.56
1  two  2000.00
```
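Newer pandas releases also accept a `decimal` keyword in `read_csv`, so both separators can be declared directly and no converter is needed; a minimal sketch with the same sample data:

```python
import io
import pandas as pd

csv = '"x","y"\n"one","1.234,56"\n"two","2.000,00"\n'

# '.' is the thousand separator, ',' the decimal mark;
# pandas then parses 'y' as float directly.
df = pd.read_csv(io.StringIO(csv), thousands='.', decimal=',')
```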

Answered By - lbolla Answer Checked By - David Marino (PHPFixing Volunteer)
