Blockchain technology is one of the most innovative developments of recent years. One of its core building blocks is the hash function.

What is Hashing?

Hashing, simply put, means taking an input string of any length and producing an output of a fixed length.

Cryptographic hashing refers to a special class of hash functions with specific properties. To be considered secure, a cryptographic hash function needs properties such as determinism (the same input always produces the same result, no matter how many times you hash it), quick computation, and pre-image resistance, among others.
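
To make the fixed-length and consistency properties concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample inputs are illustrative assumptions, not part of any particular blockchain implementation.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"a"
long_input = b"a" * 1_000_000  # one million bytes

# Fixed output length: 64 hex characters (256 bits) regardless of input size.
print(len(sha256_hex(short_input)), len(sha256_hex(long_input)))  # 64 64

# Consistency: hashing the same input twice gives the same digest.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))  # True

# Changing a single character of the input gives a different digest.
print(sha256_hex(b"hello") == sha256_hex(b"hellp"))  # False
```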

In the case of cryptocurrencies such as Bitcoin, transactions are taken as input and run through a hashing algorithm (Bitcoin uses SHA-256), which gives an output of a fixed length.

Ideally, each input has its own unique hash. For example, take inputs A and B, where H(A) and H(B) are their respective hashes. For a secure hash function, it should be infeasible to find two different inputs A and B such that H(A) is equal to H(B). Infeasible but, unfortunately, not impossible.

What is a Hash Function Attack?

A hash function attack is, most commonly, an attempt to find two input strings of a hash function that produce the same hash result. Because hash functions accept inputs of arbitrary length but produce outputs of a predefined length, there are far more possible inputs than possible outputs, so some different inputs must inevitably produce the same output hash.
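
The inevitability argument can be seen directly with a deliberately tiny hash. The sketch below assumes a toy function, tiny_hash, that keeps only the first byte of SHA-256, so it has just 256 possible outputs. Hashing 257 distinct messages must then produce at least one collision. The function names and message format are made up for illustration.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Toy hash: first byte of SHA-256, so only 256 possible outputs."""
    return hashlib.sha256(data).digest()[0]


def first_collision(num_inputs: int = 257):
    """Hash num_inputs distinct messages and return the first colliding pair."""
    seen = {}  # maps hash value -> message that produced it
    for i in range(num_inputs):
        message = f"message-{i}".encode()
        h = tiny_hash(message)
        if h in seen:
            return seen[h], message, h
        seen[h] = message
    return None  # cannot happen once num_inputs exceeds 256


pair = first_collision()
print(pair)  # two different messages and the hash value they share
```

Real cryptographic hashes work the same way in principle. The output space is simply so large that stumbling on such a pair by accident is not a practical concern.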

A hash collision occurs when two separate inputs produce the same hash output. This can be exploited against any application that compares two hashes (such as password hash checks or file integrity checks). The odds of an accidental collision are extremely low, especially for functions with a large output size. However, as available computational power increases, attacking hash functions becomes more feasible.
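
As an illustration of the kind of application at risk, here is a minimal sketch of a file integrity check. The function name file_matches_digest and the sample contents are hypothetical. The check trusts the file whenever the digests match, which is exactly why an attacker who can craft a different file with the same hash would defeat it.

```python
import hashlib
import hmac


def file_matches_digest(content: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of content equals expected_hex."""
    actual_hex = hashlib.sha256(content).hexdigest()
    # compare_digest compares in constant time to avoid timing side channels.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Hypothetical digest that a publisher would post alongside a download.
published = hashlib.sha256(b"release-1.0 contents").hexdigest()

print(file_matches_digest(b"release-1.0 contents", published))             # True
print(file_matches_digest(b"release-1.0 contents (tampered)", published))  # False
```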

How Does a Hash Function Attack Occur?

Weaknesses in a hash function can be exploited in several ways. There are mainly three types of hash function attacks:

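Collision attack: A collision attack on a cryptographic hash tries to find two different inputs that produce the same hash value. In the classical form, the attacker has no control over the content of either message; both are whatever the search algorithm happens to produce. The result is a pair of inputs A and B, with A different from B, for which H(A) is equal to H(B).

Pre-image attack: In contrast to a collision attack, in a pre-image attack the hash value is specified in advance, and the attacker must find an input that hashes to exactly that value. This is much harder than finding an arbitrary colliding pair.

Birthday attack: The birthday attack is based on the birthday paradox, i.e., the surprisingly high probability that in a group of randomly chosen people, some pair of them will have the same birthday (it passes 50% with just 23 people). Applied to hash functions, this means that for a hash with an n-bit output, an attacker needs only about 2^(n/2) random inputs, rather than 2^n, to have roughly a 50% chance of finding a collision.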
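
The difference between these attacks is easiest to see on a deliberately weakened hash. The sketch below assumes a toy function, truncated_hash, that keeps only the first 2 bytes (16 bits) of SHA-256, which makes both searches finish quickly. The function names and message formats are illustrative assumptions. Real attacks on full-size hashes do not work by this kind of brute force.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, num_bytes: int = 2) -> bytes:
    """Toy hash: the first num_bytes bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:num_bytes]


def find_collision(num_bytes: int = 2):
    """Birthday-style search: remember every hash seen until one repeats."""
    seen = {}  # maps truncated hash -> message that produced it
    for attempts in count(1):
        message = f"c-{attempts}".encode()
        h = truncated_hash(message, num_bytes)
        if h in seen:
            return seen[h], message, attempts
        seen[h] = message


def find_preimage(target: bytes, max_attempts: int = 5_000_000):
    """Brute-force search for any input whose truncated hash equals target."""
    for attempts in range(1, max_attempts + 1):
        message = f"p-{attempts}".encode()
        if truncated_hash(message, len(target)) == target:
            return message, attempts
    raise RuntimeError("no pre-image found within max_attempts")


# Collision: any two messages with the same hash will do.
a, b, collision_attempts = find_collision()

# Pre-image: the hash value is fixed in advance.
target = truncated_hash(b"some secret input")
preimage, preimage_attempts = find_preimage(target)

print("collision after", collision_attempts, "hashes:", a, b)
print("pre-image after", preimage_attempts, "hashes:", preimage)
```

With a 16-bit output, the collision search is expected to need on the order of 2^8 (a few hundred) hashes, while the pre-image search is expected to need on the order of 2^16 (tens of thousands). The exact counts depend on the inputs tried.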

How Secure are Hash Functions?

No hash function is collision free, but for a secure function with a large output, finding a collision takes an impractically long time.
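
To put rough numbers on "extremely long", the sketch below computes the exact birthday probability for a small space and then estimates, for a few output sizes, how many hashes a generic birthday search needs for a 50% chance of a collision. It uses the standard approximation k ≈ sqrt(2 · ln 2 · N) for N possible outputs. The function names are illustrative, and the estimate covers generic brute force only, not attacks that exploit a structural weakness in a specific function.

```python
import math


def birthday_probability(k: int, space: int) -> float:
    """Exact probability that k uniform random draws from `space` values repeat."""
    if k > space:
        return 1.0
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


def hashes_for_half_chance_log2(output_bits: int) -> float:
    """log2 of the approximate hash count for a 50% collision chance.

    Uses k ≈ sqrt(2 * ln 2 * N) with N = 2**output_bits.
    """
    return output_bits / 2 + 0.5 * math.log2(2 * math.log(2))


# Classic birthday paradox: 23 people, 365 possible birthdays.
print(round(birthday_probability(23, 365), 3))  # 0.507

# Generic collision cost for some common output sizes, as a power of two.
for name, bits in [("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)]:
    print(f"{name}: about 2^{hashes_for_half_chance_log2(bits):.1f} hashes")
```

Even at the smallest of these sizes, the generic search needs roughly 2^64 hash evaluations. This is why practical breaks tend to come from weaknesses in a function's design rather than from raw brute force.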

Even if a hash function has never been broken, a successful attack against a weakened variant may undermine experts’ confidence and lead to its abandonment. In the past, weaknesses were found in several then-popular hash functions, including SHA-0, RIPEMD, and MD5. These weaknesses called into question the security of stronger algorithms derived from the weak hash functions, such as SHA-1, RIPEMD-128, and RIPEMD-160.

Also, there are applications of cryptographic hash functions that do not rely on collision resistance, so collision attacks do not affect their security. For example, HMACs are not vulnerable. More generally, for a collision attack to be useful, the attacker must be in control of the input to the hash function.
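
For reference, here is a minimal sketch of computing and verifying an HMAC with Python's standard hmac module. The key, messages, and helper names (make_tag, verify_tag) are hypothetical. Because the tag depends on a secret key the attacker does not know, finding two colliding inputs for the underlying hash does not by itself let them forge a valid tag.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for message under key, as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"hypothetical-shared-secret"  # illustrative only; use a random key in practice
tag = make_tag(key, b"transfer 10 coins to alice")

print(verify_tag(key, b"transfer 10 coins to alice", tag))           # True
print(verify_tag(key, b"transfer 99 coins to mallory", tag))         # False
print(verify_tag(b"wrong-key", b"transfer 10 coins to alice", tag))  # False
```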

Are Hash Function Attacks Something to Worry About?

The truth is that it depends on the hash function. MD5 and SHA-1 are no longer considered collision resistant, since practical collisions have been demonstrated for both, but stronger functions such as SHA-256 appear to be safe for now.