Hash Codes and Hashing

What is hashing?

Hashing is the process of taking an input of any size and transforming it into a fixed-length output. This output is called the hash value or message digest. Informally, a hash is a kind of signature or fingerprint for a stream of data: it identifies the data without containing it. It is a one-way transformation, meaning the original input cannot be recovered from the hash value.
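To make the fixed-length property concrete, here is a minimal sketch using Python's standard hashlib module. The choice of SHA-256 and the helper name digest_hex are assumptions made for illustration; they do not come from the article.

```python
import hashlib

def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# One input is a single byte, the other is a million bytes.
short_digest = digest_hex(b"a")
long_digest = digest_hex(b"a" * 1_000_000)

# Both digests have the same length: 64 hex characters, i.e. 256 bits.
print(len(short_digest), len(long_digest))
```

Whatever the size of the input, the digest length depends only on the hash function chosen.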

An analogy

Suppose there are two people, Alice and Bob. Both are high school students working on a mathematics problem for extra credit. Alice solves the problem first but doesn’t want to show Bob her solution, so she hashes it and shares only the hash value. When Bob arrives at a solution days after the deadline, he wants to check that Alice isn’t bluffing, but he doesn’t want to show her his solution either. So he hashes his answer using the same hash function and compares the two hash values to check whether they got the same answer.
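The exchange between Alice and Bob could be sketched roughly as follows. The answer strings, the use of SHA-256 and the helper hash_answer are all hypothetical; the article does not say which hash function the students use.

```python
import hashlib

def hash_answer(answer: str) -> str:
    """Hash an answer; surrounding whitespace is stripped first (an illustrative choice)."""
    return hashlib.sha256(answer.strip().encode("utf-8")).hexdigest()

# Alice shares only the hash of her solution, never the solution itself.
alice_hash = hash_answer("x = 42")

# Bob hashes his own solution with the same function and compares.
bob_hash = hash_answer("x = 42")
same_answer = alice_hash == bob_hash

print("Answers match" if same_answer else "Answers differ")
```

Neither student has to reveal an answer: equal hash values indicate equal answers, and different hash values indicate different ones.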

Difference between hashing and encryption

Encryption transforms the plaintext into ciphertext, and decryption reverses the process and allows the original text to be reinstated.

Hashing, on the other hand, is a one-way operation in which data is converted into a message digest. A hashed message cannot be decrypted.
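The contrast can be illustrated with a toy example. The XOR routine below stands in for a real cipher purely to show reversibility; it is not secure, and the key, the function name xor_cipher and the use of SHA-256 are assumptions for illustration only.

```python
import hashlib

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with the repeating key. Not secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"          # hypothetical key
plaintext = b"HashThat"

# Encryption is reversible: applying the same key again restores the plaintext.
ciphertext = xor_cipher(plaintext, key)
recovered = xor_cipher(ciphertext, key)

# Hashing takes no key and has no inverse function to call.
digest = hashlib.sha256(plaintext).hexdigest()

print(recovered == plaintext)
```

With the key, the ciphertext can be turned back into the plaintext; nothing comparable exists for the digest.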

Applications

Confirming the integrity of a file

When you download a file from the internet for some important use, how do you know that the file hasn’t been tampered with? If you downloaded it again and compared the two copies bit by bit and they turned out to be the same, well and good. But what if they differed? How would you know which copy is the correct one? There’s also a possibility that both have been tampered with! Hashing provides a solution. The site providing the files should also publish the hash value of each file. After downloading a file, you hash it yourself and compare the result with the hash value published on the site, and voilà! You know whether your file is authentic or not.
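A check like this might be sketched as below. The function names file_sha256 and matches_published, the choice of SHA-256 and the file name in the usage comment are assumptions; a real site may publish digests from a different hash function.

```python
import hashlib
import hmac

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the hex digest published by the site."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())

# Usage (hypothetical file name and digest):
#   if matches_published("download.iso", "<digest copied from the site>"):
#       print("File matches the published hash")
```

The published value is lower-cased and stripped before comparison because sites differ in how they format digests.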

Password Hashing

Imagine if authenticating computer systems stored all your passwords in plain text! What if a hacker managed to break into the system? The results of such an attack would be catastrophic. This is where hashing saves the day. The system hashes each password and saves only the hash value in its password store. When a user keys in a password, the system hashes what was keyed in and compares the result with the hash value in the store. The user is granted access to the account only if the two match exactly. This way, a hacker who did break in would find only strings of letters and digits, not the passwords themselves, and the hash values cannot simply be reversed to reveal them.
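The store-and-compare flow described above could look roughly like this. Note the assumptions: the per-user random salt and the PBKDF2 key-derivation function are additions not mentioned in the article, the iteration count is illustrative rather than a recommendation, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to be saved in the password store."""
    salt = os.urandom(16)  # random per-user salt (an assumption, not from the article)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the keyed-in password the same way and compare with the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Only the salt and the digest are stored; the password itself is never written anywhere.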

Hash function

The hash function is what actually performs the transformation. Many hash functions have been widely used, such as MD-4, MD-5 and SHA-1, to name a few. Let’s look at the basic workings of some of them.

MD-4

MD-4 is a hashing algorithm that produces a 128-bit message digest. Designed by Ronald Rivest, it was intended for digital signature applications, where a large file needs to be securely compressed before being encrypted with a private (secret) key under a public-key cryptosystem. MD-4 is now considered broken and is not recommended for new applications.

Example:

Input: HashThat

Output: 74cbd3dc6167b9c74f99f6f086b45af0

Changing even a single character produces a completely different hash value.

Input: HashThan

Output: 0181a03f364ff52567aa5ddf6a0e2aff
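The effect of a one-character change can be measured by counting how many digest bits differ. The sketch below uses SHA-256 rather than MD-4, because MD-4 is not guaranteed to be available in Python's hashlib on every system; the helper name differing_bits is an assumption for illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

# The two inputs from the example above differ only in their last character.
changed = differing_bits(b"HashThat", b"HashThan")
print(f"{changed} of 256 digest bits differ")
```

For a well-behaved hash function, roughly half of the digest bits are expected to change when any part of the input changes.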

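MD-5 is an extension of MD-4. It was designed so that finding two different inputs with the same hash value (a collision) would be computationally infeasible. A hash function with this property is called collision resistant, and the property matters because the effects of a collision, such as two different documents sharing one signature, could be catastrophic! Practical collision attacks on MD-5 have since been found, so it is no longer considered collision resistant.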

SHA

This hashing algorithm, designed by the NSA (National Security Agency), produces a 160-bit message digest. It is mainly intended for use with the DSS (Digital Signature Standard). It differs from MD-4 in that it has an additional expansion operation and a further round. This change was made to accommodate the DSS block size effectively. There are several versions, such as SHA-1 and SHA-2; the 160-bit digest size applies to SHA-1, while the SHA-2 family produces longer digests.

Using SHA-1,

Input: HashThat

Output: 61677d1964f3bd2a5a0c82aab85aec6511771e97
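A SHA-1 digest for the same input can be computed with Python's standard hashlib module, as in this minimal sketch. Encoding the input as ASCII bytes is an assumption; a tool that encodes the text differently would produce a different digest.

```python
import hashlib

# Hash the example input; the text is encoded as ASCII bytes (an assumption).
digest = hashlib.sha1("HashThat".encode("ascii")).hexdigest()

print(digest)
# Each hex character carries 4 bits, so this prints the digest size in bits.
print(len(digest) * 4)
```

The printed digest can be compared with the output shown above.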

Reference

An Illustrated guide to cryptographic hashes, Steve Friedl
http://www.unixwiz.net/techtips/iguide-crypto-hashes.html
Hashing Algorithms
http://www.networksorcery.com/enp/data/hashing.htm
MD-4, Wikipedia article
http://en.wikipedia.org/wiki/MD4
gtools – free DNS, SEO and hash tools
http://gtools.org/tool/md4-hash-generator/
SHA1 Secure Hash Algorithm - Version 1.0
http://www.w3.org/PICS/DSig/SHA1_1_0.html
Image source:
http://www.unixwiz.net/techtips/iguide-crypto-hashes.html
Published with permission from Steve Friedl
http://www.unixwiz.net/