In the dynamic landscape of Web3, ZK-EVMs emerge as a sophisticated tool designed to implement the Ethereum Virtual Machine specification, this is what we will unpack in this post. Toposware has been collaborating with Polygon to build the most cost-effective implementation to date of a Type 1 ZK-EVM, supporting all opcodes and precompiles in the Shanghai EVM hard fork, and soon the upcoming Cancun hard fork. This means all new and existing smart contracts deployed to Ethereum and compatible chains are now able to take advantage of the power of zero-knowledge proofs.
The true strength of a ZK-EVM lies in its ability to prove computations through a robust cryptographic approach. The Toposware and Polygon teams took on the heroic engineering task of creating a custom-built EVM capable of tracing the execution of the EVM along with all state changes. This allows computation of a STARK proving that all state transitions occurred properly.
The process begins with the interpreter constructing a detailed record of all the operations performed during the transaction. This record, or execution trace, is then passed to a STARK prover, which generates a proof asserting its accuracy/validity. Although the proof is substantial in size, it is optimized for fast verification via a STARK system, targeting an Algebraic Intermediate Representation (AIR) with degree 3. Following this, the transaction proof undergoes a series of recursive SNARK proofs, each more compact than the last, which accelerates the verification process and enables the final transaction proof to be significantly more succinct than the initial one.
By focusing on individual transactions, the ZK-EVM prover can process transactions concurrently, enhancing overall efficiency by requiring less memory than proving an entire block. These transaction proofs are then aggregated in a binary tree structure until a single, unique block proof is generated. This ensures that consistency is maintained across the transactions, resulting in a verifiable proof for the entire block.
Ultimately, this cryptographic proof can stand alone or be combined with preceding block proofs, resulting in a single ZK-EVM proof that validates the entire blockchain back to genesis. This innovative approach holds the promise of a succinct, verifiable chain state, marking a significant milestone in the quest for blockchain verifiability, scalability, and integrity, and plays a central/crucial role in the Topos zkEcosystem.
We will now dive deeper into the details of the ZK-EVM that we have been building in collaboration with the Polygon Zero team!
Challenges with a Type-1 ZK-EVM
As mentioned above, the EVM was not designed with efficient succinct verification in mind, making it extremely challenging to design an efficient type-1 ZK-EVM:
- The EVM Word Size: The native EVM word size is 256 bits long, whereas plonky2 operates internally over the 64-bit Goldilocks field. This requires performing word operations on multiple smaller limbs to handle them properly internally. Because the Goldilocks field cannot hold an arbitrary 64-bit word, we split each EVM word into 8 32-bit limbs, unfortunately incurring an overhead even for simple operations like the ADD opcode.
- The EVM Fields Support: Ethereum transactions are signed over the secp256k1 curve, which adds major overhead when it comes to proving modular arithmetic, as we have to cope with modular reductions in the field of the proving system in parallel. This problem also arises in some recursive proving systems, known as “wrong-field arithmetic”. The EVM solves this by supporting precompiles for BN254 curve operations.
- The EVM Hash Function: the EVM uses Keccak hash both for state representation and for arbitrary hashing requests through the KECCAK256 opcode. While fairly efficient on a CPU, expressing KECCAK as a sequence of degree-3 constraints yields an extremely heavy AIR, while some more recent primitives tailored specifically for zero-knowledge proving systems would be orders of magnitude more efficient here. On top of that, the EVM supports additional hash functions through its precompiles (SHA2–256, RIPEMD-160 and blake2f), all of them being quite heavy for our proving system.
- The EVM State Representation: Ethereum uses Merkle Patricia Tries with RLP encoding. Both of these non-zk-friendly primitives incur a huge overhead on transaction processing within the ZK-EVM.
With those considerations in mind, let’s dive into the design of the ZK-EVM.
The ZK-EVM Design
As mentioned above, the ZK-EVM is designed to be efficiently verifiable by an AIR of degree 3. However, generating a single execution trace for a given transaction would yield a considerable overhead for the prover. Indeed, recall that the execution trace to generate a STARK proof can be assimilated to a large matrix where columns are registers and each row represents a view of the registers at a given time. From the initial register values on the first row to the final ones, each internal transition’s validity is enforced through a set of dedicated constraints.
Having a solely dedicated table for an entire EVM execution would require thousands of columns, and the matrix, while highly sparse, would still be treated by the prover as a fully dense one. Instead, we exploit the fact that some pieces of the execution can be viewed completely independently from the rest, which allows splitting the EVM trace into different STARK modules, each responsible for ensuring the integrity of their underlying computation.
These smaller STARK modules however do not operate on independent values. This requires an additional set of constraints, to enforce that values haven’t been tampered with when shared between several STARKs. For this, we use Cross-Table-Lookups (CTLs), based on the logUp argument [https://eprint.iacr.org/2022/1530.pdf] designed by Ulrich Haböck to cheaply add these copy-constraints in the overall system.
Thanks to the CTLs, we can split the ZK-EVM into a primary STARK module, the central processing unit CPU Stark, and 6 auxiliary state machines, each responsible for a set of specific operations. The CPU is then in charge of orchestrating the whole flow corresponding to the EVM transaction execution, dispatching some instructions to the secondary modules, and fetching the result. Note here that “dispatching” and “fetching” means that initial values/final values resulting from a given operation are being copied with the CTLs to/from the targeted secondary state machine.
As an example, the diagram below depicts a high-level overview of a BITWISE XOR (opcode 0x18), between two 256-bit words. The CPU is in charge of reading the inputs from the Memory module, sending them with the XOR instruction to the Logic module, and fetching from the latter the expected output, to be sent back to Memory to be written at the targeted address.
Let us dive in the different auxiliary state machines orchestrated by the CPU:
- The Arithmetic Module: This STARK aims at enforcing arithmetic operations. These include all the existing arithmetic opcodes (like ADD, MUL, SDIV, …), as well as the BYTE opcode returning the i-th word of an EVM word. Additionally, it supports custom instructions, which usage is restricted to prover mode (i.e. end-users cannot use them in their contracts) to simplify certain operations. This allows to alleviate greatly part of the overhead induced by the 2nd challenge mentioned above.
- The Keccak Module: This STARK is used to prove Keccak hash permutations and alleviate part of points 3 and 4. As Keccak internal operations are defined on bytes, the associated trace is particularly wide (it has as many registers as 20 CPUs aligned side-by-side!), which renders it particularly heavy to recursively verify when reducing the proof size.
- The KeccakSponge Module: While Keccak Stark’s sole focus is on a single Keccak permutation, we may need to hash arbitrarily long sequences of bytes (think of the EXTCODEHASH [https://ethereum.org/en/developers/docs/evm/opcodes/] opcode for instance). For this, this module acts as an intermediary orchestrator between the CPU and the Keccak modules, keeping track of the actual state of the hasher and feeding successive inputs to the Keccak module, before feeding back to the CPU the final output.
- The Logic Module: This STARK is used for BITWISE operations on EVM words (namely AND, OR and XOR opcodes). It is not heavily used in general, which allows for a layout tailored for prover efficiency, with a relatively wide trace and fairly short length.
- The Memory Module: While the CPU is in charge of handling all internal operations, the Memory STARK ensures consistency across all memory segments, as well as the state of the EVM stack. Because every operation incurs memory reads and/or writes, this makes this STARK usually the longest one and the heaviest to prove, with a trace length easily reaching a few million rows for complex contracts or heavy precompiles.
- The BytePacking Module: While the Memory usually operates on EVM words, some scenarios only deal with bytes. This is the case when reading the transaction RLP encoding, or when parsing a contract bytecode. This makes the process of memory copying overly heavy in those cases. The BytePacking module allows us to amortize this, as its name indicates, by packing sequences of at most 32 bytes into a single EVM word, and perform the reverse operations whenever we need to get back to the original bytes. This helps alleviate part of the overhead induced by the 4th point above.
This makes up for the high-level design of the ZK-EVM, focusing on a single transaction proof. In the future, we will cover in more depth the internals of the ZK-EVM, with a focus on the custom EVM interpreter and the CPU logic, how proofs are aggregated to succinctly verify an entire block, and we will delve into some of the massive optimizations that came along the way!