Issue
My dataframe looks something like this:
import pandas as pd
df = pd.read_sql('select * from foo', con)  # con: an already-open database connection
a b c
0.1 0.2 0.3
0.3 0.4 0.5
If I run df['a'] * df['b'] directly,
the result is not as exact as I expected because of floating-point rounding.
I tried
from decimal import Decimal
df['a'].apply(Decimal) * df['b'].apply(Decimal)
But when I inspect df['a'].apply(Decimal) in PyCharm, the column contains strange values. Here is an example (not the real numbers):
a
0.09999999999999999
0.30000000000001231
How can I do exact multiplication in pandas?
Solution
The problem is not in pandas but in floating-point inaccuracy: decimal.Decimal(0.1)
is Decimal('0.1000000000000000055511151231257827021181583404541015625')
on my 64-bit system.
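To make the difference concrete, here is a minimal sketch using only the standard library. The variable names are illustrative and not part of the original answer. It compares a Decimal built from the float 0.1 with one built from the string '0.1':

```python
from decimal import Decimal

# Built from a float: Decimal records the exact value of the binary double nearest to 0.1.
from_float = Decimal(0.1)

# Built from a string: exactly one tenth.
from_string = Decimal("0.1")

# Going through str() first: str(0.1) is '0.1', so the result is exactly one tenth as well.
via_str = Decimal(str(0.1))

print(from_float)
print(from_string)
print(from_float == from_string)  # False
print(via_str == from_string)     # True
```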
A simple trick is to convert the floats to strings first, because pandas knows enough about string conversion to round the values properly:
x = df['a'].astype(str).apply(Decimal) * df['b'].astype(str).apply(Decimal)
You will get a nice Series of Decimal:
>>> print(x.values)
[Decimal('0.02') Decimal('0.12')]
This gives you exact decimal operations, which can matter if you process monetary values.
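For anyone who wants to try this without a database, the following is a self-contained sketch. It assumes a small hand-built DataFrame in place of the SQL table from the question, and to_decimal is a hypothetical helper name, not a pandas function:

```python
from decimal import Decimal

import pandas as pd

# Hypothetical stand-in for the table that the question reads from SQL.
df = pd.DataFrame({"a": [0.1, 0.3], "b": [0.2, 0.4], "c": [0.3, 0.5]})


def to_decimal(series):
    """Convert a float column to Decimal objects via their string form."""
    return series.astype(str).apply(Decimal)


# Plain float multiplication, subject to binary rounding.
float_product = df["a"] * df["b"]

# Decimal multiplication: the result is a Series holding Decimal objects.
exact_product = to_decimal(df["a"]) * to_decimal(df["b"])

print(float_product.tolist())
print(exact_product.tolist())  # [Decimal('0.02'), Decimal('0.12')]
```

Note that the Decimal column holds Python objects rather than a native numeric dtype, so it will likely be slower than float arithmetic on large frames.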
Answered By - Serge Ballesta Answer Checked By - Willingham (PHPFixing Volunteer)