0.1 Hash Functions

Let's start with understanding the most important tool in the cryptographer's tool box: hash functions. Hash functions are used everywhere: from building quickly searchable data structures, to storing passwords or encrypting messages and verifying data integrity.

Let's say you download a huge file on the internet, that could be a list of transactions in a ledger for instance. Maybe your internet connection is not so stable and some packets might have been tempered with. To ensure your file is not corrupted, you could use a function that takes the file data as an input and computes a fingerprint identifier for the file, and compare that with the identifier posted by the person who shared that file to you, or even the one from other people to ensure you all have the same version of the file. Such a function should be deterministic, fast to compute and collision resistant we'll describe what that means exactly later in that chapter.

Now let's take another example, let's say you run a website where you have to store your users accounts, including an email and password to authenticate them. If you were to store users password directly you would risk them being stolen on the first hack of your database. You could instead store a fingerprint of that password on user registration. Whenever logging in, users would just give their password, you would compute the fingerprint and compare it with the version you stored to ensure it is the right password. Such a function to compute fingerprint should then be pre-image resistance, deterministic, collision resistance as we'll, again, see later.

Both those use cases require a hash function. A cryptographic hash function takes an input (a message, a transaction, a block of data, or an entire ledger, whatever sequence of bytes really) and produces a fixed-size element, which is its hash. It could be a string of characters, or an element of $\mathbb{Z}/p\mathbb{Z}$ where $p$ is a prime number. This hash is unique to the input data - any change to the input, even a single character, will produce a completely different hash. In more formal terms, if $h:\{0,1\}^{*}\rightarrow\mathbb{Z}/p\mathbb{Z}$ :

Injection: for two keys k1 ≠ k2, the hash function should give different results h(k1) ≠ h(k2), with probability $\frac{p-1}{p}$ .
Diffusion (stronger than injection): if k1 ≠ k2, knowing h(k1) gives no information about h(k2). For example, if k2 is exactly the same as k1, except for one bit, then every bit in h(k2) should change with 1/2 probability compared to h(k1). Knowing the bits of h(k1) does not give any information about the bits of h(k2).

A good cryptographic hash function should have the following properties:

Deterministic – The same input always produces the same output.
Fast computation – The hash can be quickly computed for any given input.
Pre-image resistance – Given a hash, it is nearly impossible to determine the original input.
Small changes cause big differences – A tiny modification to the input drastically changes the output.
Collision resistance – It is infeasible to find two different inputs that produce the same hash.

Hashing in Blockchain

Hash function can be used to ensure the integrity and immutability of the ledger itself. A decentralized system must prevent anyone from altering past transactions while still allowing new transactions to be added. This is where cryptographic hashing comes into play.

In a blockchain, every block, a new page of transaction in the ledger, contains:

A list of transactions
A timestamp
A hash of the previous block
A nonce

By including the hash of the previous block, each block is cryptographically linked to the one before it. This creates a chain of blocks, forming an immutable ledger. If someone tries to alter a past transaction, they would need to change all following blocks hash as well, making tampering practically impossible.

Example: SHA-256

A widely used cryptographic hash function in blockchain systems is SHA-256 (Secure Hash Algorithm 256-bit). When a transaction is processed, it is hashed using SHA-256, producing a unique 256-bit hash. This hash is then stored in the blockchain, and represents the transaction, ensuring it cannot be altered.

For example, hashing the message:

A sends 3 coins to B

might produce the hash:

464982b640260c16e9c5938e6f799fd655bc29ae415135bda51021632148e7a4

Even the slightest change in the message (e.g., A sends 3.1 coins to B) would generate a completely different hash:

569d1670830b7cd00968c52ebc678574974eb2bc6a7e304fb786231cfb2d1a3f

Cryptographic hashing ensures that transactions recorded in the ledger remain immutable.

Commitment Schemes: Locking Secrets Before Revealing

Alright, so now that we’ve got a solid grasp of cryptographic hash functions and how they ensure data integrity, let’s talk about another interesting cryptographic trick: commitment schemes. These are like digital envelopes - you lock something inside them, and no one can peek until you decide to open it.

What’s a Commitment Scheme?

Imagine you and a friend want to make a bet about the outcome of a sports game. You want to write down your prediction before the game starts so that you can prove you were right later - but you also don’t want your friend to see it in advance. This is where a commitment scheme comes in handy!

A commitment scheme lets you:

Commit to a value: You choose a value (e.g., "Team A will win"), and you create a cryptographic commitment to it, essentially locking it in a secure way.
Reveal the value later: Once the game is over, you can show your original prediction, and your friend can verify that you didn’t change it.

The Two Key Properties of a Commitment Scheme

For this to be useful, a commitment scheme needs two important properties:

Binding: Once you commit to a value, you can’t change it later. It’s like sealing a prediction in an envelope - if you try to swap out the paper inside, people will know something’s up.
Hiding: No one should be able to figure out what you committed to until you reveal it. The envelope is opaque until you decide to open it.

How Does This Work in Cryptography?

Commitment schemes often use hash functions to ensure both binding and hiding. A simple way to commit to a value is to hash it along with some extra secret randomness (called a nonce or salt). This randomness ensures that even if two people commit to the same value, their commitments look completely different.

For example:

You commit to "Team A wins" by hashing:
commitment = hash("Team A wins" + random_value)
Later, when you reveal your prediction, you show both the original message ("Team A wins") and the random value you used.
Your friend can now verify your commitment by hashing your revealed value and checking that it matches your original commitment.

In the next chapter, we will explore the different type of methods that can be used to encrypt and decrypt messages, as well as how it allows to identify users and to authenticate them when building a blockchain.

PreviousCryptography and Blockchain fundamentals Next0.2 Symmetric vs Asymmetric Encryption

Last updated 1 month ago