Clock 5 MINUTES

What is a Merkle Tree and How Does it Affect Blockchain Technology?

Merkle Trees are one of the reasons blockchain technology has managed to be so successful today. Here’s a comprehensive overview of what they are and why they matter.

15 Nov
2019
SelfKey
Identity Wallet
key-icon

If you are involved in the cryptocurrency world, you have probably seen the phrase “Merkle Tree” and felt a little lost (in all honesty it sounds like a tree named after German chancellor Angela Merkel). Even within the crypto community, Merkle Trees (also known as binary hash trees) are not a widely-understood concept, but they are also not that complicated.

In very simple terms, a Merkle Tree is a way of structuring data that allows a large body of information to be verified for accuracy both extremely efficiently and quickly. They have become a crucial component of blockchain technology and cryptocurrency, so let’s take a closer look.

In this article we cover all you need to know about Merkle Trees, and why they are important for blockchain technology.

What is a Merkle Tree?

The Merkle Tree has been around since 1979, when a man named Ralph Merkle was at Stanford University. Merkle wrote a paper titled “A Certified Digital Signature” during his time at Stanford, and unknowingly created a major component of blockchain. In his paper, Merkle described a brand new method of creating proofs. Essentially, Merkle designed a process for verifying data that would allow computers to work much faster than before.

Merkle’s idea, now called a Merkle Tree, fundamentally changed the world of cryptography, including the way encrypted computer protocols function. As a result, Merkle Trees have grown in popularity over the years, especially when it comes to cryptocurrency. In fact, they are mentioned several times in Satoshi Nakamoto’s essay that introduced Bitcoin, and they are used considerably in Bitcoin’s foundational code. Other cryptocurrencies like Ethereum have adopted Merkle Trees as well.

The SelfKey Identity Wallet is a free identity solution for Windows, Linux and Mac. Get yours today!

Before we dive further into Merkle Trees, it is important to mention blockchain. As you may already know, each transaction on a blockchain has its own unique transaction ID. For most blockchains, an ID is a 64-character code that takes up 256 bits (or 32 bytes) of memory.

When you think about the fact that blockchains are usually made up of hundreds of thousands of blocks, and that each block can contain up to several thousand transactions, it becomes clear that memory space and computing power are two big problems. As a result, it is advantageous to use as little data as possible when processing and verifying transactions. It not only reduces CPU processing times, but also ensures a higher level of security.

And that is precisely what Merkle Trees do. Essentially, Merkle Trees take an enormous number of transaction IDs and run them through a mathematical process that results in one 64-character code, which is called a Merkle Root. The Merkle Root is vital because it authorizes any computer to quickly verify that a specific transaction took place on a certain block as accurately as possible. 

What is a Merkle Root?

Before we dive further into this topic, it is important to briefly talk about hashing. Hashing functions are mathematical algorithms that take inputs and generate unique outputs. Some of the most common hashing functions are MD5, SHA-3, and SHA-256 – the last of which is used by Bitcoin. We will come back to hashing in a minute.

As we now know, the single code that a Merkle Tree produces is referred to as a Merkle Root. Each individual block in a blockchain has one. How a Merkle Root is produced is slightly more complicated, but important to understand. 

By design, Merkle Trees always group all of the data inputs into pairs. If there happens to be an odd number of inputs, then the last input is copied and paired with itself. This applies to all transaction IDs written onto a block in a blockchain.

Here is an example. Let’s say that a single block contains a total of 424 transactions. The Merkle Tree would start by grouping these transactions into 212 pairs. The next step is for the 212 transaction ID pairs to go through a hashing function. This would result in 212 new 64-character codes.

The process continues. The 212 new codes would be paired up and turned into 106 pairs. The process is repeated again, cutting the number of codes in half each time, until only one code remains. That code is the Merkle Root.

A Merkle Tree Example

To help solidify this concept, here is a very simple example of a Merkle Tree. Let’s imagine that there were four transactions performed on one block: A, B, C, and D. Each transaction is then hashed, leaving us with: 

Hash A

Hash B

Hash C

Hash D

The hashes are paired together resulting in:

Hash AB

and

Hash CD

These two hashes are hashed together to give us our Merkle Root: Hash ABCD. In reality, a Merkle Tree is far more complicated than this (especially when you consider that each transaction ID is 64 characters long) but this should give you an idea as to how the algorithms work and why it is so effective. 

The Benefit of Merkle Trees

A Merkle Tree can considerably reduce the amount of data that has to be maintained for verification purposes. It can reside locally or on a distributed system. In essence, a Merkle Tree separates the validation of data from the data itself.

Merkle Trees have four sizable benefits:

  1. They provide a way to prove both the integrity and validity of data
  2. They significantly reduce the amount of memory needed to do the above
  3. The required proof and management only needs small amounts of information to be transmitted across networks
  4. Simplified Payment Verification (SPV) – a way of verifying transactions in a block without downloading an entire block. Often used by lightweight Bitcoin clients.

It is essential to blockchain technology that a log can be proved to be complete and consistent. Merkle Trees help validate that later versions of a log include everything from an earlier version. They also ensure that all data is recorded and presented in chronological order. Proving that a log is consistent can be difficult, as it calls for showing that no prior records have been altered, added, or tampered with. Additionally, it also needs to be shown that the log has never been forked or branched.

Merkle Trees benefit both users and miners on a blockchain. Users can verify individual parts of blocks, and can also check transactions by using hashes from other branches of the Merkle Tree. Miners can calculate hashes progressively as they receive transactions from their peers.

Why Merkle Trees are Vital for Blockchain

To understand just how important Merkle Trees are for blockchain technology, you need to imagine a blockchain without them. We will primarily be referring to Bitcoin here as its use of Merkle Trees is not only vital for the cryptocurrency, but also easy to understand. For example, if Bitcoin did not have Merkle Trees, every single node on the network would need to keep a complete copy of every single transaction that has ever occurred on Bitcoin. You can imagine how much information that would be.

When confirming a past transaction, a node would need to reach out to the network in order to get copies of the ledger from its peers. The node would need to compare each entry line by line to make sure its own records and the network records matched exactly. If there was any discrepancy between the ledgers, it could compromise the security of the network. 

Every verification request on Bitcoin would require insanely large packets of information to be sent over the network, because in order to validate the data you need to have the data itself. The computer used for validating would need to apply a lot of processing power to compare the ledgers to ensure that there had been no changes.

Merkle Trees solve this problem. They hash the records in the ledger, which effectively separates the proof of data from the data itself. Proving a transaction is valid only involves sending small amounts of information across the network. Additionally, it allows you to prove that both versions of the ledger are the same with nominal amounts of computing power and network bandwidth.

Ethereum uses Merkle Trees, however, they utilize a more complicated method. It is called a Merkle Patricia Tree and it uses three different Merkle Roots for each block. 

Conclusion

In all likelihood, if Merkle Trees had never been invented, cryptocurrency and blockchain technology would have never existed. Unless there was a viable alternative, the amount of computing power and storage would simply be too expensive to run. It is fascinating that a relatively old idea is still being used in modern technology, and is one of the reasons this technology can run efficiently in the first place.

Merkel Trees are vital for blockchains and allow them to function effectively along with maintaining transaction integrity. Understanding the role that Merkle Trees play is essential to understanding the basic concepts within cryptocurrencies as they continue to develop.