Issue
I was playing around with sys.getrefcount in Python 3.7 on Windows. I tried the following:
>>> import sys
>>> x = "this is an arbitrary string"
>>> sys.getrefcount(x)
2
I understand that one of the references is x, and the other is the parameter used internally within sys.getrefcount. This seems to work no matter the type to which x is initialized. However, I noticed some weird behavior when I don't assign before I pass:
>>> import sys
>>> sys.getrefcount("arbitrary string")
2
>>> sys.getrefcount(1122334455)
2
>>> sys.getrefcount(1122334455+1)
2
>>> sys.getrefcount(frozenset())
2
>>> sys.getrefcount(set())
1
>>> sys.getrefcount(object())
1
>>> sys.getrefcount([])
1
>>> sys.getrefcount(lambda x: x)
1
>>> sys.getrefcount(range(1122334455))
1
>>> sys.getrefcount(dict())
1
>>> sys.getrefcount(())
8341
>>> sys.getrefcount(tuple())
8340
>>> sys.getrefcount(list("arbitrary string"))
1
>>> sys.getrefcount(tuple("arbitrary string"))
1
>>> sys.getrefcount(("a", "r", "b", "i", "t", "r", "a", "r", "y", " ", "s", "t", "r", "i", "n", "g"))
2
What is going on here? It seems that immutable types have two references but mutable types have only one. Why do some objects seem to be assigned somewhere before being passed, while others only ever have a reference as a parameter?
Does this have something to do with str/int/tuple interning?
Edit: A more directed question: Why was it chosen that immutable types like frozenset() have a reference upon construction, while mutable types like set() do not? I understand in isolation why you might choose to keep this global-scope reference or not across the board, but why the discrepancy?
Solution
Answering my own question as I've learned more.
The difference has to do with how Python code objects store constants. The bytecode of sys.getrefcount("arbitrary string") is the following:
>>> import dis
>>> dis.dis('sys.getrefcount("arbitrary string")')
  1           0 LOAD_NAME                0 (sys)
              2 LOAD_METHOD              1 (getrefcount)
              4 LOAD_CONST               0 ('arbitrary string')
              6 CALL_METHOD              1
              8 RETURN_VALUE
Here, the LOAD_CONST opcode doesn't construct a new string from scratch; it merely loads a string out of the code object's tuple of constants. That tuple is what holds the extra reference:
>>> f = lambda: sys.getrefcount("arbitrary string")
>>> f.__code__.co_consts
(None, 'arbitrary string')
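A quick way to confirm this is to check object identity rather than counts. The sketch below uses a hypothetical helper, get_constant, that is not part of the original answer, and it assumes CPython, where a string literal in a function body ends up in that function's co_consts.

```python
def get_constant():
    # The literal below is compiled into get_constant.__code__.co_consts.
    return "arbitrary string"

consts = get_constant.__code__.co_consts

# The object returned by the call is the one held in co_consts,
# not a fresh copy built at call time.
same_object = any(c is get_constant() for c in consts)

print(consts)
print(same_object)
```

If the string were rebuilt on every call, the identity check would have no reason to succeed.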
With this in mind, some examples make sense:
>>> import sys
# the string is stored in co_consts
>>> sys.getrefcount("arbitrary string")
2
# the integer is stored in co_consts
>>> sys.getrefcount(1122334455)
2
# this addition is constant-folded and the result is stored in co_consts
>>> sys.getrefcount(1122334455+1)
2
# Tuples of constants are folded into one big constant:
>>> sys.getrefcount(("a", "r", "b", "i", "t", "r", "a", "r", "y", " ", "s", "t", "r", "i", "n", "g"))
2
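The folding can be observed without running anything, by compiling an expression and looking at the resulting co_consts. This is a minimal sketch assuming CPython's compile-time constant folding; the "<example>" filename is just a placeholder, and the exact contents of co_consts may include extra entries depending on the interpreter version.

```python
# Compile the expressions without running them and inspect the constants.
folded_int = compile("1122334455 + 1", "<example>", "eval")
folded_tuple = compile('("a", "r", "b")', "<example>", "eval")

print(folded_int.co_consts)
print(folded_tuple.co_consts)

# The sum and the whole tuple are already present as single constants.
int_is_folded = 1122334456 in folded_int.co_consts
tuple_is_folded = ("a", "r", "b") in folded_tuple.co_consts
```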
Meanwhile, the following objects can't be stored in co_consts because they have to be reconstructed each time: they are either mutable, or they are produced by calling a name that has to be looked up at run time:
>>> sys.getrefcount(set()) # construct a new set at call time
1
>>> sys.getrefcount(object()) # construct a new object at call time
1
>>> sys.getrefcount([]) # construct a new list at call time
1
>>> sys.getrefcount(lambda x: x) # construct a new function at call time
1
# construct a new range object at call time.
# We could do something dumb like range = abs,
# so this can't be constant-folded.
>>> sys.getrefcount(range(1122334455))
1
>>> sys.getrefcount(dict()) # construct a new dict at call time
1
>>> sys.getrefcount(list("arbitrary string")) # construct a new list at call time
1
# Construct a new tuple at call time.
# We could do something dumb like tuple=list
# so this can't be constant-folded.
>>> sys.getrefcount(tuple("arbitrary string"))
1
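The "name could be rebound" point can be demonstrated directly. In this sketch (the "<example>" filename and the variable names are placeholders of mine), the same compiled code object gives different results depending on what range refers to when it runs, which is why the compiler cannot precompute the call.

```python
code = compile("range(3)", "<example>", "eval")

# The call target is stored by name, not as a prebuilt range object.
print(code.co_names)
has_range_constant = any(isinstance(c, range) for c in code.co_consts)

# Evaluate the same code object with and without rebinding `range`.
normal_result = eval(code, {})
rebound_result = eval(code, {"range": abs})

print(normal_result, rebound_result)
```

With range rebound to abs, the expression evaluates to abs(3), so no single constant could stand in for the call.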
Finally, the third class of objects are those that make use of some kind of caching or interning: when a new object is requested, Python can cheat and hand back an already-existing object.
# There is only one empty frozenset that gets re-used.
>>> sys.getrefcount(frozenset())
2
# There is only one empty tuple that gets re-used
# whenever someone requests an empty tuple
>>> sys.getrefcount(())
8341
# The same thing, but that tuple doesn't get stored
# in co_consts because the name `tuple` could be rebound.
>>> sys.getrefcount(tuple())
8340
To verify the assertion about frozensets, note that the refcount of the empty frozenset goes up by one for each additional name bound to frozenset():
>>> sys.getrefcount(frozenset())
2
>>> x, y, z = frozenset(), frozenset(), frozenset()
>>> sys.getrefcount(frozenset())
5
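An identity check is another way to probe this kind of sharing, and it does not depend on exact reference counts. The sketch below assumes CPython, where the empty tuple is a shared object; whether other types such as frozenset share an empty instance is an implementation detail that may depend on the interpreter version.

```python
# Hold both results in variables so neither is freed before the comparison.
first_tuple = ()
second_tuple = tuple()
tuples_shared = first_tuple is second_tuple

# Lists are mutable, so each request must produce a distinct object.
first_list = []
second_list = list()
lists_shared = first_list is second_list

print(tuples_shared, lists_shared)
```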
Answered By - Dennis
Answer Checked By - Candace Johnson (PHPFixing Volunteer)