Diving Deep into the Ethereum Yellow Paper: A Comprehensive Technical Guide

CollinSherriff
5 min read · Jul 14, 2023

TL;DR: Sorry, but there is no TL;DR. You can give this PDF a shot, but good luck: https://ethereum.github.io/yellowpaper/paper.pdf

The Ethereum Yellow Paper, authored by Dr. Gavin Wood, is the definitive technical document that underpins the Ethereum protocol. It outlines the Ethereum Virtual Machine (EVM), the internal scripting language, the block structure, and the overall functioning of the Ethereum ecosystem. This post will delve deep into the Yellow Paper and provide a comprehensive technical overview of its key concepts.

Ethereum Protocol: A Mathematical Perspective

Ethereum’s power as a decentralized platform stems from its basis in complex mathematical concepts and robust algorithms. Its design is such that every computational step, transaction verification, and consensus mechanism follows a defined set of mathematical rules and protocols.

The Ethereum Virtual Machine (EVM)

The EVM is the cornerstone of the Ethereum network, responsible for executing all smart contracts. The Yellow Paper describes it as a quasi-Turing-complete machine: it can express any computation, but the ‘quasi’ qualifier reflects that every execution is bounded by the amount of gas supplied, so no program can run forever.

The EVM executes bytecode on a simple stack machine. It processes instructions by pushing and popping 256-bit words on a Last-In-First-Out (LIFO) data structure known as a stack, which can hold at most 1024 items. Each operation has a cost, measured in ‘gas’: most opcodes have a fixed cost, while a few, such as those that expand memory or write to storage, have a cost that depends on their operands and the current state.
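
To make the stack-and-gas idea concrete, here is a minimal sketch of a gas-metered stack machine. It is not the EVM: it handles only four opcodes, and the `run` function, the `OutOfGas` exception, and the opcode table are names made up for this post. The byte values and gas costs follow the commonly published EVM values for these opcodes, but treat them as assumptions and check the Yellow Paper's appendix for the fork you care about.

```python
# Illustrative opcode table: byte -> (name, gas cost). Assumed values.
OPCODES = {
    0x00: ("STOP", 0),
    0x01: ("ADD", 3),
    0x02: ("MUL", 5),
    0x60: ("PUSH1", 3),
}

WORD_MASK = 2**256 - 1  # results wrap around at 256 bits


class OutOfGas(Exception):
    """Raised when the next operation costs more gas than is left."""


def run(code: bytes, gas: int):
    """Execute bytecode on a LIFO stack; return (stack, gas_remaining)."""
    stack = []
    pc = 0  # program counter
    while pc < len(code):
        name, cost = OPCODES[code[pc]]
        if cost > gas:
            raise OutOfGas(f"{name} needs {cost} gas, only {gas} left")
        gas -= cost
        pc += 1
        if name == "STOP":
            break
        elif name == "PUSH1":
            stack.append(code[pc])  # the next byte is the value to push
            pc += 1
        elif name == "ADD":
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) & WORD_MASK)
        elif name == "MUL":
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) & WORD_MASK)
    return stack, gas


# (2 + 3) * 4 as bytecode: PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP
program = bytes([0x60, 2, 0x60, 3, 0x01, 0x60, 4, 0x02, 0x00])
final_stack, gas_left = run(program, gas=100)
print(final_stack, gas_left)
```

Running the same program with too little gas raises `OutOfGas` partway through, which is the mechanism that keeps every execution finite.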

Gas and Gas Cost

Ethereum’s concept of ‘gas’ is a fundamental aspect of the network’s computational economics. Gas serves as a measurement of computational work, effectively metering the execution of transactions based on their complexity.

The Yellow Paper specifies an intricate model for gas calculation and consumption. It lists base gas costs for each operation in an appendix, calculated based on the operation’s computational intensity.

The actual cost of a transaction in Ether (ETH) is derived using the formula:

transaction cost (ETH) = gas used * gas price

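where the gas price is a value set by the transaction initiator. (For transactions that follow EIP-1559, the price paid per unit of gas is the protocol’s base fee plus a priority fee offered by the sender, up to the sender’s stated maximum.)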
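
As a quick worked example of the formula, the sketch below computes the cost in wei (the smallest unit of ETH) and converts it to ETH. The inputs are assumptions chosen for illustration: 21,000 gas is the usual figure for a plain ETH transfer, and 20 gwei is just a made-up gas price.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18


def transaction_cost_wei(gas_used: int, gas_price_wei: int) -> int:
    """transaction cost = gas used * gas price, kept in integer wei."""
    return gas_used * gas_price_wei


gas_used = 21_000                  # assumed: a plain ETH transfer
gas_price_wei = 20 * WEI_PER_GWEI  # assumed: 20 gwei per unit of gas

cost_wei = transaction_cost_wei(gas_used, gas_price_wei)
cost_eth = Decimal(cost_wei) / Decimal(WEI_PER_ETH)
print(cost_wei, "wei =", cost_eth, "ETH")
```

Doing the arithmetic in integer wei and converting only for display avoids floating-point rounding.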

Ethereum Accounts

Ethereum’s ledger maintains a globally accessible data structure called the “World State” (W). It’s a mapping between addresses (160-bit identifiers) and account states. Mathematically, it’s defined as a function W: A -> S where A is the set of Ethereum addresses and S is the set of all possible account states.

There are two types of accounts: Externally Owned Accounts (EOAs) and Contract Accounts. EOAs are controlled by private keys and can initiate transactions. Contract Accounts hold contract code and can’t initiate transactions; their code runs only when they receive a transaction from an EOA or a message call from another contract, and during that execution they can make message calls of their own.
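
One way to picture the world state is as a dictionary from 20-byte addresses to account records. The sketch below is a simplification, not the Yellow Paper's definition: the real account state stores a storage root and a code hash, whereas this hypothetical `Account` class keeps the storage and code inline, and the two addresses are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    nonce: int = 0
    balance: int = 0     # in wei
    storage: tuple = ()  # simplification: stands in for the storage root
    code: bytes = b""    # simplification: stands in for the code hash

    @property
    def is_contract(self) -> bool:
        # In this sketch, an account with code is a contract account.
        return len(self.code) > 0


# Addresses are 160-bit (20-byte) identifiers; these two are made up.
alice = bytes.fromhex("aa" * 20)
counter = bytes.fromhex("cc" * 20)

world_state = {
    alice: Account(nonce=5, balance=10**18),
    counter: Account(nonce=1, code=bytes([0x60, 0x00])),
}


def lookup(state: dict, address: bytes) -> Account:
    """Assumption for this sketch: unknown addresses read as empty accounts."""
    return state.get(address, Account())


print(lookup(world_state, alice).is_contract)    # EOA
print(lookup(world_state, counter).is_contract)  # contract account
```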

Another blog on Account Abstraction (ERC-4337) coming soon! A true game-changer.

Block Structure

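The Yellow Paper defines Ethereum’s block structure formally. In the original specification, each block header (H) consists of fifteen fields, including the parent hash, ommers hash, beneficiary, and state root; later protocol upgrades added more, such as the base fee per gas. The ommers hash is the Keccak-256 hash of the list of the block’s ommers (also called uncles: stale blocks whose parent is a recent ancestor of the current block; Bitcoin has no equivalent and simply discards such blocks). Rewarding ommers makes rewards fairer and improves network security.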
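
For reference, here is a sketch of those fifteen header fields as a Python type. The snake_case names are my own rendering of the Yellow Paper's field names, the comments are short paraphrases, and headers on current mainnet carry additional fields that are not shown.

```python
from typing import NamedTuple


class BlockHeader(NamedTuple):
    parent_hash: bytes        # Keccak-256 hash of the parent block's header
    ommers_hash: bytes        # hash of this block's ommer list
    beneficiary: bytes        # 160-bit address that receives the fees
    state_root: bytes         # root of the state trie after execution
    transactions_root: bytes  # root of the trie of this block's transactions
    receipts_root: bytes      # root of the trie of transaction receipts
    logs_bloom: bytes         # Bloom filter over the block's log entries
    difficulty: int
    number: int               # number of ancestor blocks (genesis is 0)
    gas_limit: int
    gas_used: int
    timestamp: int            # Unix time
    extra_data: bytes         # arbitrary data, at most 32 bytes
    mix_hash: bytes           # proof-of-work field
    nonce: bytes              # proof-of-work field (64 bits)


HEADER_FIELDS = BlockHeader._fields
print(len(HEADER_FIELDS), "fields:", ", ".join(HEADER_FIELDS))
```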

Ethereum’s State Transition Function

The Yellow Paper writes the state transition function as Υ: σ’ = Υ(σ, T), where σ is the current world state (the mapping called W above), T is a transaction, and σ’ is the new state.

The state transition function takes the current state and a transaction, verifies the transaction according to a set of predefined rules, applies the transaction, and then returns the new state.
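
The shape of that function (state in, transaction in, new state out) can be sketched in a few lines. This is a heavily simplified model under stated assumptions: it handles plain value transfers only, ignores gas and signatures entirely, and the names `apply_transaction` and `InvalidTransaction` are hypothetical.

```python
class InvalidTransaction(Exception):
    pass


EMPTY = {"nonce": 0, "balance": 0}


def apply_transaction(state: dict, tx: dict) -> dict:
    """Return a new state; the input state is left untouched."""
    sender = state.get(tx["from"], EMPTY)
    # Rule 1: the transaction nonce must equal the sender's current nonce.
    if tx["nonce"] != sender["nonce"]:
        raise InvalidTransaction("bad nonce")
    # Rule 2: the sender must be able to afford the value being sent.
    if sender["balance"] < tx["value"]:
        raise InvalidTransaction("insufficient balance")

    new_state = {addr: dict(acct) for addr, acct in state.items()}
    new_state[tx["from"]] = {
        "nonce": sender["nonce"] + 1,
        "balance": sender["balance"] - tx["value"],
    }
    recipient = new_state.get(tx["to"], EMPTY)
    new_state[tx["to"]] = {**recipient, "balance": recipient["balance"] + tx["value"]}
    return new_state


state_0 = {"alice": {"nonce": 0, "balance": 100}}
tx = {"from": "alice", "to": "bob", "value": 30, "nonce": 0}
state_1 = apply_transaction(state_0, tx)
print(state_1)
```

Because the sender's nonce is incremented, applying the same transaction to `state_1` fails the nonce check, which is what prevents replays.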

Patricia Trees (Trie)

The Yellow Paper introduces a version of the Modified Merkle Patricia Tree (trie) data structure for efficient state storage and manipulation. It’s essentially a radix tree, an ordered tree used to store an associative array in which each key is a sequence of bytes, split into 4-bit nibbles that trace a path from the root to the node holding the value (such as an account state or a transaction). Every node is referenced by its hash, so the root hash commits to the entire contents. This gives Ethereum a cryptographically authenticated data structure, a crucial aspect that ensures network security and data integrity.
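
A full trie is too long for a blog post, but one small piece is easy to show: how a key is split into nibbles, and how a nibble path is packed back into bytes with the hex-prefix encoding used in leaf and extension nodes. This sketch covers only that encoding step, as I understand it from the Yellow Paper's appendix; the function names are my own, and hashing and node structure are left out.

```python
def to_nibbles(key: bytes) -> list:
    """Split each byte into its high and low 4-bit halves."""
    nibbles = []
    for byte in key:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    return nibbles


def hex_prefix(nibbles: list, is_leaf: bool) -> bytes:
    """Pack a nibble path into bytes, recording node type and parity."""
    flag = 2 if is_leaf else 0
    if len(nibbles) % 2 == 1:
        # Odd length: set the parity bit and store the first nibble
        # in the low half of the prefix byte.
        first = 16 * (flag + 1) + nibbles[0]
        rest = nibbles[1:]
    else:
        # Even length: the low half of the prefix byte is zero padding.
        first = 16 * flag
        rest = nibbles
    packed = bytes(16 * rest[i] + rest[i + 1] for i in range(0, len(rest), 2))
    return bytes([first]) + packed


print(to_nibbles(b"\x12\xab"))
print(hex_prefix([1, 2, 3, 4, 5], is_leaf=False).hex())
print(hex_prefix([15, 1, 12, 11, 8], is_leaf=True).hex())
```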

RLP: Recursive Length Prefix

Ethereum uses a unique method called Recursive Length Prefix (RLP) encoding for storing and transmitting data structures. It’s essentially Ethereum’s main encoding method used to serialise objects. The Yellow Paper presents the mathematical representation and the algorithms for RLP encoding and decoding.
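
RLP is compact enough to sketch in full. The encoder below handles the two things RLP knows about, byte strings and (possibly nested) lists of them. It is a minimal illustration rather than a production codec: there is no decoder, and integers would need converting to big-endian bytes before being passed in. The function names are my own.

```python
def _length_prefix(length: int, offset: int) -> bytes:
    """Short payloads get a one-byte prefix; long ones also encode the length."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    if isinstance(item, (bytes, bytearray)):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)
        return _length_prefix(len(item), 0x80) + bytes(item)
    if isinstance(item, list):
        # Lists are encoded recursively, then prefixed with the payload length.
        payload = b"".join(rlp_encode(child) for child in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("RLP encodes only bytes and lists")


print(rlp_encode(b"dog").hex())
print(rlp_encode([b"cat", b"dog"]).hex())
print(rlp_encode([[], [[]], [[], [[]]]]).hex())
```

The "recursive" in the name is the list branch: a list's encoding is just the concatenated encodings of its items behind a length prefix.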

Contract Creation

The Yellow Paper provides an in-depth look at the process of contract creation. It explains how the init code carried by a contract-creation transaction (the data supplied when the transaction has no recipient) is executed once as EVM bytecode to initialise the new account; the output of that execution becomes the contract’s stored code. The function Λ represents the contract creation operation.

Cryptographic Specifications

The Yellow Paper also explains the various cryptographic algorithms used in Ethereum, including the Keccak-256 hash function and the Elliptic Curve Digital Signature Algorithm (ECDSA) — (another blog on this soon, because this is crazy!). It provides the mathematical underpinnings of these algorithms and their crucial role in securing the Ethereum network.

Transaction Execution

The Yellow Paper specifies transaction execution in detail through the transaction-level state transition function Υ. It covers validation, nonce increment, up-front payment for gas, value transfer, contract creation via Λ (if applicable) or message call execution via Θ, and gas refunds. The code execution function Ξ is what actually runs EVM bytecode inside those calls.
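
One of the gas calculations is simple enough to show directly: the intrinsic gas a transaction must pay before any code runs. The constants below are the values I believe recent versions of the Yellow Paper use (21,000 per transaction, 4 per zero data byte, 16 per non-zero data byte, 32,000 extra for contract creation); treat them as assumptions, since they have changed across forks and later upgrades add further terms that this sketch omits. The function names are hypothetical.

```python
G_TRANSACTION = 21_000    # assumed: paid by every transaction
G_TX_CREATE = 32_000      # assumed: extra cost of a contract creation
G_TX_DATA_ZERO = 4        # assumed: per zero byte of data
G_TX_DATA_NONZERO = 16    # assumed: per non-zero byte of data


def intrinsic_gas(data: bytes, is_contract_creation: bool) -> int:
    """Gas charged up front, before any EVM code executes."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    gas = G_TRANSACTION
    gas += G_TX_DATA_ZERO * zero_bytes
    gas += G_TX_DATA_NONZERO * nonzero_bytes
    if is_contract_creation:
        gas += G_TX_CREATE
    return gas


def has_enough_gas(gas_limit: int, data: bytes, is_contract_creation: bool) -> bool:
    """Part of validation: the gas limit must cover the intrinsic gas."""
    return gas_limit >= intrinsic_gas(data, is_contract_creation)


print(intrinsic_gas(b"", False))
print(intrinsic_gas(bytes.fromhex("00ff00ff"), False))
print(has_enough_gas(20_000, b"", False))
```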

Uncle Blocks (Ommers)

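Under proof of work, two miners can find valid blocks at almost the same time, and network latency means only one of them ends up in the canonical chain. Ethereum lets later blocks reference the losing blocks as uncle blocks (or ommers) and pays a reduced reward for them, which makes mining fairer and the consensus mechanism more secure. The Yellow Paper provides mathematical specifications for validating and rewarding such blocks.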
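
Here is a sketch of the reward arithmetic as I read it in the proof-of-work versions of the Yellow Paper: an ommer earns less the older it is relative to the block that includes it, and the including block earns a small bonus per ommer. The 2 ETH block reward, the six-block depth limit, and the function names are assumptions for illustration.

```python
BLOCK_REWARD_WEI = 2 * 10**18  # assumed block reward
MAX_OMMER_DEPTH = 6            # assumed: how far back an ommer may be


def ommer_reward(block_number: int, ommer_number: int,
                 block_reward: int = BLOCK_REWARD_WEI) -> int:
    """Reward to the ommer's miner: (8 - depth) / 8 of the block reward."""
    depth = block_number - ommer_number
    if not 1 <= depth <= MAX_OMMER_DEPTH:
        raise ValueError("ommer is too old or too new to be included")
    return (8 - depth) * block_reward // 8


def inclusion_bonus(ommer_count: int,
                    block_reward: int = BLOCK_REWARD_WEI) -> int:
    """Bonus to the including block's miner: 1/32 of the reward per ommer."""
    return ommer_count * (block_reward // 32)


print(ommer_reward(block_number=100, ommer_number=99))
print(ommer_reward(block_number=100, ommer_number=94))
print(inclusion_bonus(2))
```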

Fork Choice Rule

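One of the essential aspects of any blockchain protocol is its fork choice rule, the mechanism by which nodes agree on a single canonical chain. Under the proof-of-work rules the Yellow Paper describes, Ethereum does not simply prefer the longest chain: nodes choose the chain with the highest total difficulty (the ‘heaviest’ chain), which need not be the one with the most blocks.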
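
A tiny sketch shows why total difficulty and block count can disagree. The chains and difficulty numbers are invented, and blocks are reduced to dictionaries with just the fields needed here.

```python
def total_difficulty(chain: list) -> int:
    """Sum of the difficulty of every block in the chain."""
    return sum(block["difficulty"] for block in chain)


def choose_canonical(chains: list) -> list:
    """Pick the chain with the highest total difficulty."""
    return max(chains, key=total_difficulty)


# Made-up competing forks: chain_a has more blocks, chain_b has more work.
chain_a = [{"number": n, "difficulty": 100} for n in range(1, 6)]
chain_b = [{"number": n, "difficulty": 150} for n in range(1, 5)]

canonical = choose_canonical([chain_a, chain_b])
print(len(chain_a), total_difficulty(chain_a))
print(len(chain_b), total_difficulty(chain_b))
print("canonical is chain_b:", canonical is chain_b)
```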

Opcode and Precompiled Contracts

The Ethereum Yellow Paper provides an exhaustive list of opcodes used by the Ethereum Virtual Machine and the gas cost associated with each. It also discusses precompiled contracts: contracts at fixed low addresses that can be called like any other contract, but whose logic is implemented natively in the client rather than in EVM bytecode, for efficiency. See OpCodes here: https://www.evm.codes/?fork=shanghai
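
To illustrate what "implemented in the client" means, the sketch below models two precompiles as ordinary Python functions behind an address lookup. Addresses 2 (SHA-256) and 4 (identity) and their gas formulas are the values I believe the Yellow Paper gives, but treat them as assumptions; the `call` function and the dispatch table are hypothetical and far simpler than a real client.

```python
import hashlib


def _words(data: bytes) -> int:
    """Number of 32-byte words needed to hold the data, rounded up."""
    return (len(data) + 31) // 32


def sha256_precompile(data: bytes):
    # Assumed gas formula: 60 + 12 per 32-byte word of input.
    return 60 + 12 * _words(data), hashlib.sha256(data).digest()


def identity_precompile(data: bytes):
    # Assumed gas formula: 15 + 3 per 32-byte word; output equals input.
    return 15 + 3 * _words(data), data


# Precompiles live at fixed addresses and run native code, not bytecode.
PRECOMPILES = {0x02: sha256_precompile, 0x04: identity_precompile}


def call(address: int, data: bytes, gas: int):
    """Return (output, gas_remaining) for a call to a precompile."""
    handler = PRECOMPILES.get(address)
    if handler is None:
        raise KeyError("not a precompile; a real client would run EVM bytecode")
    cost, output = handler(data)
    if cost > gas:
        raise RuntimeError("out of gas")
    return output, gas - cost


print(call(0x04, b"abc", gas=100))
print(call(0x02, b"", gas=100)[0].hex())
```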

Wow, well-done if you’ve made it this far.

Concluding

To conclude, the Ethereum Yellow Paper is a profound document that outlines the backbone of the Ethereum blockchain. From the underpinnings of the Ethereum Virtual Machine (EVM), the essential gas economics, the structure of blocks, the state transition function, to Patricia Trees, Recursive Length Prefix encoding, contract creation, cryptographic specifications, transaction execution, uncle blocks, fork choice rules, opcode lists, and precompiled contracts, it brings to light the significant facets that shape Ethereum as a decentralized platform.

The dense mathematical and technical intricacies contained within it are not for the faint-hearted. They demand a robust foundation in computer science, cryptography, and mathematics. Yet, for those who choose to venture, it offers the path to an in-depth understanding of Ethereum’s inner workings and its foundational principles.

Grasping these concepts is crucial, not only for those developing on the Ethereum platform but also for anyone deeply invested in the blockchain technology landscape. As Ethereum continues to evolve, the Ethereum Yellow Paper remains a timeless resource, testament to the ingenious design and innovative spirit at the heart of blockchain technology.
