# Issue

It is my understanding that the `range()`

function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.

This being the case, I would have expected the following line to take an inordinate amount of time because, in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:

```
1_000_000_000_000_000 in range(1_000_000_000_000_001)
```

Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).

I have also tried things like this, but the calculation is still almost instant:

```
# count by tens
1_000_000_000_000_000_000_000 in range(0,1_000_000_000_000_000_000_001,10)
```

If I try to implement my own range function, the result is not so nice!

```
def my_crappy_range(N):
i = 0
while i < N:
yield i
i += 1
return
```

What is the `range()`

object doing under the hood that makes it so fast?

Martijn Pieters's answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for `range`

to be a full-fledged *sequence* in Python 3, and some information/warning regarding potential inconsistency for `__contains__`

function optimization across Python implementations. abarnert's other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of `xrange`

in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.

# Solution

The Python 3 `range()`

object doesn't produce numbers immediately; it is a smart sequence object that produces numbers *on demand*. All it contains is your start, stop and step values, then as you iterate over the object the next integer is calculated each iteration.

The object also implements the `object.__contains__`

hook, and *calculates* if your number is part of its range. Calculating is a (near) constant time operation ^{*}. There is never a need to scan through all possible integers in the range.

From the `range()`

object documentation:

The advantage of the

`range`

type over a regular`list`

or`tuple`

is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the`start`

,`stop`

and`step`

values, calculating individual items and subranges as needed).

So at a minimum, your `range()`

object would do:

```
class my_range:
def __init__(self, start, stop=None, step=1, /):
if stop is None:
start, stop = 0, start
self.start, self.stop, self.step = start, stop, step
if step < 0:
lo, hi, step = stop, start, -step
else:
lo, hi = start, stop
self.length = 0 if lo > hi else ((hi - lo - 1) // step) + 1
def __iter__(self):
current = self.start
if self.step < 0:
while current > self.stop:
yield current
current += self.step
else:
while current < self.stop:
yield current
current += self.step
def __len__(self):
return self.length
def __getitem__(self, i):
if i < 0:
i += self.length
if 0 <= i < self.length:
return self.start + i * self.step
raise IndexError('my_range object index out of range')
def __contains__(self, num):
if self.step < 0:
if not (self.stop < num <= self.start):
return False
else:
if not (self.start <= num < self.stop):
return False
return (num - self.start) % self.step == 0
```

This is still missing several things that a real `range()`

supports (such as the `.index()`

or `.count()`

methods, hashing, equality testing, or slicing), but should give you an idea.

I also simplified the `__contains__`

implementation to only focus on integer tests; if you give a real `range()`

object a non-integer value (including subclasses of `int`

), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.

* *Near* constant time because Python integers are unbounded and so math operations also grow in time as N grows, making this a O(log N) operation. Since it’s all executed in optimised C code and Python stores integer values in 30-bit chunks, you’d run out of memory before you saw any performance impact due to the size of the integers involved here.

Answered By - Martijn Pieters Answer Checked By - Clifford M. (PHPFixing Volunteer)

## 0 Comments:

## Post a Comment

Note: Only a member of this blog may post a comment.