Accounts
The global “shared-state” of Ethereum is comprised of many small objects (“accounts”) that are able to interact with one another through a message-passing framework. Each account has a state associated with it and a 20-byte address. An address in Ethereum is a 160-bit identifier that is used to identify any account.
There are two types of accounts:
Externally owned accounts, which are controlled by private keys and have no code associated with them.
Contract accounts, which are controlled by their contract code and have code associated with them.
Image for post
Externally owned accounts vs. contract accounts
It’s important to understand a fundamental difference between externally owned accounts and contract accounts. An externally owned account can send messages to other externally owned accounts OR to other contract accounts by creating and signing a transaction using its private key. A message between two externally owned accounts is simply a value transfer. But a message from an externally owned account to a contract account activates the contract account’s code, allowing it to perform various actions (e.g. transfer tokens, write to internal storage, mint new tokens, perform some calculation, create new contracts, etc.).
Unlike externally owned accounts, contract accounts can’t initiate new transactions on their own. Instead, contract accounts can only fire transactions in response to other transactions they have received (from an externally owned account or from another contract account). We’ll learn more about contract-to-contract calls in the “Transactions and Messages” section.
Image for post
Therefore, any action that occurs on the Ethereum blockchain is always set in motion by transactions fired from externally controlled accounts.
Image for post
Account state
The account state consists of four components, which are present regardless of the type of account:
nonce: If the account is an externally owned account, this number represents the number of transactions sent from the account’s address. If the account is a contract account, the nonce is the number of contracts created by the account.
balance: The number of Wei owned by this address. There are 1e+18 Wei per Ether.
storageRoot: A hash of the root node of a Merkle Patricia tree (we’ll explain Merkle trees later on). This tree encodes the hash of the storage contents of this account, and is empty by default.
codeHash: The hash of the EVM (Ethereum Virtual Machine — more on this later) code of this account. For contract accounts, this is the code that gets hashed and stored as the codeHash. For externally owned accounts, the codeHash field is the hash of the empty string.
Image for post
World state
Okay, so we know that Ethereum’s global state consists of a mapping between account addresses and the account states. This mapping is stored in a data structure known as a Merkle Patricia tree.
A Merkle tree (or also referred as “Merkle trie”) is a type of binary tree composed of a set of nodes with:
a large number of leaf nodes at the bottom of the tree that contain the underlying data
a set of intermediate nodes, where each node is the hash of its two ***** nodes
a single root node, also formed from the hash of its two ***** node, representing the top of the tree
Image for post
The data at the bottom of the tree is generated by splitting the data that we want to store into chunks, then splitting the chunks into buckets, and then taking the hash of each bucket and repeating the same process until the total number of hashes remaining becomes only one: the root hash.
Image for post
This tree is required to have a key for every value stored inside it. Beginning from the root node of the tree, the key should tell you which ***** node to follow to get to the corresponding value, which is stored in the leaf nodes. In Ethereum’s case, the key/value mapping for the state tree is between addresses and their associated accounts, including the balance, nonce, codeHash, and storageRoot for each account (where the storageRoot is itself a tree).
Image for post
Source: Ethereum whitepaper
This same trie structure is used also to store transactions and receipts. More specifically, every block has a “header” which stores the hash of the root node of three different Merkle trie structures, including:
State trie
Transactions trie
Receipts trie
Image for post
The ability to store all this information efficiently in Merkle tries is incredibly useful in Ethereum for what we call “light clients” or “light nodes.” Remember that a blockchain is maintained by a bunch of nodes. Broadly speaking, there are two types of nodes: full nodes and light nodes.
A full archive node synchronizes the blockchain by downloading the full chain, from the genesis block to the current head block, executing all of the transactions contained within. Typically, miners store the full archive node, because they are required to do so for the mining process. It is also possible to download a full node without executing every transaction. Regardless, any full node contains the entire chain.
But unless a node needs to execute every transaction or easily query historical data, there’s really no need to store the entire chain. This is where the concept of a light node comes in. Instead of downloading and storing the full chain and executing all of the transactions, light nodes download only the chain of headers, from the genesis block to the current head, without executing any transactions or retrieving any associated state. Because light nodes have access to block headers, which contain hashes of three tries, they can still easily generate and receive verifiable answers about transactions, events, balances, etc.
The reason this works is because hashes in the Merkle tree propagate upward — if a malicious user attempts to swap a fake transaction into the bottom of a Merkle tree, this change will cause a change in the hash of the node above, which will change the hash of the node above that, and so on, until it eventually changes the root of the tree.
Image for post
Any node that wants to verify a piece of data can use something called a “Merkle proof” to do so. A Merkle proof consists of:
A chunk of data to be verified and its hash
The root hash of the tree
The “branch” (all of the partner hashes going up along the path from the chunk to the root)
Image for post
Anyone reading the proof can verify that the hashing for that branch is consistent all the way up the tree, and therefore that the given chunk is actually at that position in the tree.
In summary, the benefit of using a Merkle Patricia tree is that the root node of this structure is cryptographically dependent on the data stored in the tree, and so the hash of the root node can be used as a secure identity for this data. Since the block header includes the root hash of the state, transactions, and receipts trees, any node can validate a small part of state of Ethereum without needing to store the entire state, which can be potentially unbounded in size.
33 bitcoin