# Issue

I recently answered a question on a sister site which asked for a function that counts all even digits of a number. One of the other answers contained two functions (which turned out to be the fastest, so far):

```
def count_even_digits_spyr03_for(n):
count = 0
for c in str(n):
if c in "02468":
count += 1
return count
def count_even_digits_spyr03_sum(n):
return sum(c in "02468" for c in str(n))
```

In addition I looked at using a list comprehension and `list.count`

:

```
def count_even_digits_spyr03_list(n):
return [c in "02468" for c in str(n)].count(True)
```

The first two functions are essentially the same, except that the first one uses an explicit counting loop, while the second one uses the built-in `sum`

. I would have expected the second one to be faster (based on e.g. this answer), and it is what I would have recommended turning the former into if asked for a review. But, it turns out it is the other way around. Testing it with some random numbers with increasing number of digits (so the chance that any single digit is even is about 50%) I get the following timings:

**Why is the manual for loop so much faster?** It's almost a factor two faster than using

`sum`

. And since the built-in `sum`

should be about five times faster than manually summing a list (as per the linked answer), it means that it is actually ten times faster! Is the saving from only having to add one to the counter for half the values, because the other half gets discarded, enough to explain this difference?Using an `if`

as a filter like so:

```
def count_even_digits_spyr03_sum2(n):
return sum(1 for c in str(n) if c in "02468")
```

Improves the timing only to the same level as the list comprehension.

When extending the timings to larger numbers, and normalizing to the `for`

loop timing, they asymptotically converge for very large numbers (>10k digits), probably due to the time `str(n)`

takes:

# Solution

`sum`

is quite fast, but `sum`

isn't the cause of the slowdown. Three primary factors contribute to the slowdown:

- The use of a generator expression causes overhead for constantly pausing and resuming the generator.
- Your generator version adds unconditionally instead of only when the digit is even. This is more expensive when the digit is odd.
- Adding booleans instead of ints prevents
`sum`

from using its integer fast path.

Generators offer two primary advantages over list comprehensions: they take a lot less memory, and they can terminate early if not all elements are needed. They are *not* designed to offer a time advantage in the case where all elements are needed. Suspending and resuming a generator once per element is pretty expensive.

If we replace the genexp with a list comprehension:

```
In [66]: def f1(x):
....: return sum(c in '02468' for c in str(x))
....:
In [67]: def f2(x):
....: return sum([c in '02468' for c in str(x)])
....:
In [68]: x = int('1234567890'*50)
In [69]: %timeit f1(x)
10000 loops, best of 5: 52.2 µs per loop
In [70]: %timeit f2(x)
10000 loops, best of 5: 40.5 µs per loop
```

we see an immediate speedup, at the cost of wasting a bunch of memory on a list.

If you look at your genexp version:

```
def count_even_digits_spyr03_sum(n):
return sum(c in "02468" for c in str(n))
```

you'll see it has no `if`

. It just throws booleans into `sum`

. In constrast, your loop:

```
def count_even_digits_spyr03_for(n):
count = 0
for c in str(n):
if c in "02468":
count += 1
return count
```

only adds anything if the digit is even.

If we change the `f2`

defined earlier to also incorporate an `if`

, we see another speedup:

```
In [71]: def f3(x):
....: return sum([True for c in str(x) if c in '02468'])
....:
In [72]: %timeit f3(x)
10000 loops, best of 5: 34.9 µs per loop
```

`f1`

, identical to your original code, took 52.2 µs, and `f2`

, with just the list comprehension change, took 40.5 µs.

It probably looked pretty awkward using `True`

instead of `1`

in `f3`

. I wrote `True`

there because changing it to `1`

activates one final speedup. `sum`

has a fast path for integers, but the fast path only activates for objects whose type is exactly `int`

. `bool`

doesn't count. This is the line that checks that items are of type `int`

:

```
if (PyLong_CheckExact(item)) {
```

Once we make the final change, changing `True`

to `1`

:

```
In [73]: def f4(x):
....: return sum([1 for c in str(x) if c in '02468'])
....:
In [74]: %timeit f4(x)
10000 loops, best of 5: 33.3 µs per loop
```

we see one last small speedup.

So after all that, do we beat the explicit loop?

```
In [75]: def explicit_loop(x):
....: count = 0
....: for c in str(x):
....: if c in '02468':
....: count += 1
....: return count
....:
In [76]: %timeit explicit_loop(x)
10000 loops, best of 5: 32.7 µs per loop
```

Nope. We've roughly broken even, but we're not beating it. The big remaining problem is the list. Building it is expensive, and `sum`

has to go through the list iterator to retrieve elements, which has its own cost (though I think that part is pretty cheap). Unfortunately, as long as we're going through the test-digits-and-call-`sum`

approach, we don't have any good way to get rid of the list. The explicit loop wins.

Can we go further anyway? Well, we've been trying to bring the `sum`

closer to the explicit loop so far, but if we're stuck with this dumb list, we could diverge from the explicit loop and just call `len`

instead of `sum`

:

```
def f5(x):
return len([1 for c in str(x) if c in '02468'])
```

Testing digits individually isn't the only way we can try to beat the loop, too. Diverging even further from the explicit loop, we can also try `str.count`

. `str.count`

iterates over a string's buffer directly in C, avoiding a lot of wrapper objects and indirection. We need to call it 5 times, making 5 passes over the string, but it still pays off:

```
def f6(x):
s = str(x)
return sum(s.count(c) for c in '02468')
```

Unfortunately, this is the point when the site I was using for timing stuck me in the "tarpit" for using too many resources, so I had to switch sites. The following timings are not directly comparable with the timings above:

```
>>> import timeit
>>> def f(x):
... return sum([1 for c in str(x) if c in '02468'])
...
>>> def g(x):
... return len([1 for c in str(x) if c in '02468'])
...
>>> def h(x):
... s = str(x)
... return sum(s.count(c) for c in '02468')
...
>>> x = int('1234567890'*50)
>>> timeit.timeit(lambda: f(x), number=10000)
0.331528635986615
>>> timeit.timeit(lambda: g(x), number=10000)
0.30292080697836354
>>> timeit.timeit(lambda: h(x), number=10000)
0.15950968803372234
>>> def explicit_loop(x):
... count = 0
... for c in str(x):
... if c in '02468':
... count += 1
... return count
...
>>> timeit.timeit(lambda: explicit_loop(x), number=10000)
0.3305045129964128
```

Answered By - user2357112 Answer Checked By - Cary Denson (PHPFixing Admin)

## 0 Comments:

## Post a Comment

Note: Only a member of this blog may post a comment.