# Issue

I dont find any info about % of collisions for xxhash64.

I'm going to use it for cache system (to generate hash keys which need to be unique, about a hundreds millions). Now i use md5, but i don't need cryptographic property.

So i need some info, to decide does is it a good decision for my task. In best case - comparison of the number of collisions between md5 and xxHash64.

# Solution

You can calculate yourself by using the birthday problem.

In general the mathematical expression that gives you the probability of hash function is :

p(k) = 1 - exp(-k(k-1)/2N, k (number of hashes) randomly generated values, where each value is a non-negative integer less than N (number of possible hashes):N = 2^(number of bit), example for md5 it is 2^128, or 2^32 for 32 bit-hash

**If you use md5**

will produce a 128-bit hash value, by applying this formula you get this 'S' graph. This graph explains, for example, in order to get a collison probability of 50% (0.5), you need at least 21 000 000 trillion of hashes or 21 quintillion of hashes!!!! If you we use less than, for instance 1 billion of hashes, the probability of collision is negligible.

*If you are using hundred millions of hashed keys, the probability of collision is 0% using md5.*

**If you use xxhash64,**

**Assuming that xxhash64 produce a 64-bit hash**. You will get this graph.

According to this picture, you can see that if the collision percentage is 50%, you need at least 5 billion of hashes. Two of the 5 billion of hashes can have an odd of 1/2 to have the same hashes!!! If you have around 12 billion of hashes there is 100% of chance that the hashes collide.

*If you are using hundred millions of hashed keys, the probability of collision is 0.033% using xxhash64*.

This link explains why md5 or fast hash method are not secure.

Answered By - Michael GEDION Answer Checked By - Gilberto Lyons (PHPFixing Admin)

## 0 Comments:

## Post a Comment

Note: Only a member of this blog may post a comment.