Issue

I'm currently writing code using decimal.Decimal in python (v3.8.5).

I was wondering if anyone knows how the Decimal object is actually encoded.

I can't understand why the memory size is the same even if I change getcontext().prec, which is equal to change coefficients and exponent in decimal floating-points, as follows

from decimal import *
from sys import getsizeof

## coefficient bits = 3
getcontext().prec = 3

temp = Decimal('1')/Decimal('3')

print(temp.as_tuple()) >>> DecimalTuple(sign=0, digits=(3, 3, 3), exponent=-3)
print(getsizeof(temp)) >>> 104

## coefficient bits = 30
getcontext().prec = 30

temp = Decimal('1')/Decimal('3')

print(temp.as_tuple()) >>> DecimalTuple(sign=0, digits=(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), exponent=-30)
print(getsizeof(temp)) >>> 104

In order to understand the above behavior, I read the source code of Decimal class and the attached document.

According to the document, Python's Decimal object is implemented based on IEEE 754-2008, and the decimal digits of coefficient continuation are converted into binary digits using DPD (Densely packed decimal) encoding.

Therefore, according to the DPD algorithm, we can calculate the number of bits when the decimal digits of the coefficient continuation are encoded into binary digits.

And since the sign, exponent continuation, and combination field are simply expressed in binary, the number of bits when encoded can be easily calculated.

So, we can calculate the number of bits when Decimal obejcet is encoded by the following formula. bits = (sign) + (exp) + (comb) + (compressed coeff)

Here, sign and combination are fixed at 1bit and 5bits, respectively (according to the definition of IEEE 754-2008. https://en.wikipedia.org/wiki/Decimal_floating_point).

So, I wrote the above code to check the list of {sign, exponent, coefficient} using as_tuple() of the Decimal object, and calculate the actual number of bits in memory.

However, as mentioned above, the memory size of the Decimal object did not change at all, even though the number of digits in the coefficient should have changed. (I understand that a Decimal object is not only a decimal encoding but also a list and other objects.)

The following two questions arise.

(1) Am I wrong in my understanding of the encoding algorithm of the Decimal object in python? (Does python3.8.5 use a more efficient encoding algorithm than IEEE 754-2008?)

(2) Assuming that my understanding of the algorithm is correct, why does the memory size of the Decimal object remain the same even though the coefficient has been changed? (According to the definition of IEEE754-2008, when coefficient continuation is changed, exponent continuation is also changed, and total bits should be changed.)

I myself am a student who usually studies in the field of mechanical engineering, and I am a complete beginner in informatics. If there is any part of my original understanding that is wrong or if there is any strange logical development, please let me know.

I appreciate your help.

Solution

For sys.getsizeof:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

Since Decimal is a Python class with references to several other objects (EDIT: see below), you just get the total size of the references, which is constant — not including the referred values, which are not.

getcontext().prec = 3
temp = Decimal(3) / Decimal(1)
print(sys.getsizeof(temp))
print(sys.getsizeof(temp._int))

getcontext().prec = 300
temp = Decimal(3) / Decimal(1)
print(sys.getsizeof(temp))         # same
print(sys.getsizeof(temp._int))    # not same

(Note that _int slot I used in the example is an internal implementation detail of CPython's Decimal, as hinted by the leading underscore; this code is not guaranteed to work in other Python implementations, or even in other versions.)

EDIT: Oops, my first answer was on an old Python, where Decimal is implemented in Python. The version you asked about has it implemented in C.

The C version actually stores everything inside the object itself, but your difference in precision was not sufficient to detect the difference (as memory is allocated in discrete chunks). Try it with getcontext().prec = 300 instead.

Answered By - Amadan

Answer Checked By - Gilberto Lyons (PHPFixing Admin)

Sunday, August 7, 2022

[FIXED] How exactly is a Decimal object encoded in python?

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Sunday, August 7, 2022

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Subscribe To