Contracts, Deployment, and the Host Trait
Today I crossed the boundary from a single-contract interpreter to a system that can create contracts, emit logs, and interact with external state.
CREATE and CREATE2
When a contract creates another contract, the EVM needs to determine the new contract's address. Here is what I found:
CREATE:
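The address depends on the sender and the sender's nonce: address = keccak256(rlp([sender, nonce]))[12..], the last 20 bytes of the hash. For a contract account the nonce increments with each contract creation; for an externally owned account it increments with each transaction. The address is predictable only if you know the current nonce.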
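A minimal sketch of that derivation, assuming a Keccak-256 implementation is supplied by the caller (the `keccak256` parameter is a stand-in, since the standard library has none). The helper names are hypothetical, and the RLP code covers only the short forms this preimage needs.

```rust
/// RLP-encode a byte string. Only the short form (< 56 bytes) is handled,
/// which is all that a 20-byte address and a u64 nonce require.
fn rlp_bytes(data: &[u8]) -> Vec<u8> {
    if data.len() == 1 && data[0] < 0x80 {
        return vec![data[0]]; // a single byte below 0x80 encodes as itself
    }
    assert!(data.len() < 56, "long-string form not needed for this sketch");
    let mut out = vec![0x80 + data.len() as u8];
    out.extend_from_slice(data);
    out
}

/// RLP-encode a nonce as a big-endian integer without leading zeros.
fn rlp_nonce(nonce: u64) -> Vec<u8> {
    let be = nonce.to_be_bytes();
    let first = be.iter().position(|&b| b != 0).unwrap_or(be.len());
    rlp_bytes(&be[first..]) // zero becomes the empty string, encoded as 0x80
}

/// Build rlp([sender, nonce]), the preimage that CREATE hashes.
pub fn create_preimage(sender: &[u8; 20], nonce: u64) -> Vec<u8> {
    let mut payload = rlp_bytes(sender);
    payload.extend(rlp_nonce(nonce));
    // Short-list prefix: the payload is at most 30 bytes here.
    let mut out = vec![0xc0 + payload.len() as u8];
    out.extend(payload);
    out
}

/// address = keccak256(rlp([sender, nonce]))[12..]
/// `keccak256` is injected by the caller (assumption of this sketch).
pub fn create_address<F>(sender: &[u8; 20], nonce: u64, keccak256: F) -> [u8; 20]
where
    F: Fn(&[u8]) -> [u8; 32],
{
    let hash = keccak256(&create_preimage(sender, nonce));
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&hash[12..]);
    addr
}
```

Injecting the hash keeps the sketch dependency-free; in a real project the closure would wrap whatever Keccak-256 crate the interpreter already uses.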
CREATE2:
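CREATE2 (EIP-1014) replaces the nonce with a caller-chosen salt and the hash of the initcode: address = keccak256(0xff ++ sender ++ salt ++ keccak256(initcode))[12..]. Nothing in the formula depends on account state, so you can compute the address before deployment. This enables: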
- Factory patterns: deploy contracts at known addresses
- Counterfactual instantiation: reference a contract before it exists
- Account Abstraction: pre-compute wallet addresses
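The CREATE2 derivation from EIP-1014 can be sketched the same way. As an assumption of this sketch, Keccak-256 is passed in as a closure rather than implemented here, and the function names are illustrative.

```rust
/// Build the 85-byte CREATE2 preimage:
/// 0xff ++ sender (20) ++ salt (32) ++ keccak256(initcode) (32).
pub fn create2_preimage<F>(
    sender: &[u8; 20],
    salt: &[u8; 32],
    initcode: &[u8],
    keccak256: &F,
) -> Vec<u8>
where
    F: Fn(&[u8]) -> [u8; 32],
{
    let mut preimage = Vec::with_capacity(1 + 20 + 32 + 32);
    preimage.push(0xff); // constant prefix that cannot collide with CREATE's RLP list
    preimage.extend_from_slice(sender);
    preimage.extend_from_slice(salt);
    preimage.extend_from_slice(&keccak256(initcode));
    preimage
}

/// address = keccak256(preimage)[12..]
/// `keccak256` is supplied by the caller (assumption of this sketch).
pub fn create2_address<F>(
    sender: &[u8; 20],
    salt: &[u8; 32],
    initcode: &[u8],
    keccak256: F,
) -> [u8; 20]
where
    F: Fn(&[u8]) -> [u8; 32],
{
    let hash = keccak256(&create2_preimage(sender, salt, initcode, &keccak256));
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&hash[12..]);
    addr
}
```

Because the inputs are only the deployer, the salt, and the initcode, a factory or wallet can run this off-chain and hand out the address before any deployment transaction exists.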
Initcode vs. Runtime Code
When deploying a contract, you don't send the runtime bytecode directly. You send initcode — a program that executes once and whose RETURN value becomes the deployed code.
Transaction: { to: null, data: initcode }
→ EVM executes initcode
→ initcode calls RETURN(offset, size)
→ bytes at memory[offset..offset+size] become the deployed code
This is why Solidity constructors can run arbitrary logic — they are part of the initcode.
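To make the initcode/runtime split concrete, here is a sketch that wraps a runtime bytecode in the smallest constructor I could write by hand: it copies the runtime bytes into memory with CODECOPY and hands them back with RETURN. The function name is hypothetical, and the sketch assumes the runtime code is shorter than 256 bytes so every operand fits in a PUSH1.

```rust
/// Wrap `runtime` in a minimal initcode stub that returns it as the deployed code.
/// Sketch only: single-byte PUSH1 operands limit the runtime code to 255 bytes.
pub fn wrap_in_initcode(runtime: &[u8]) -> Vec<u8> {
    assert!(runtime.len() < 256, "sketch uses PUSH1, so the size must fit in one byte");
    let size = runtime.len() as u8;
    const PREFIX_LEN: u8 = 12; // byte length of the stub below

    let mut code = vec![
        0x60, size,       // PUSH1 size
        0x60, PREFIX_LEN, // PUSH1 offset of the runtime code inside the initcode
        0x60, 0x00,       // PUSH1 0 (destination offset in memory)
        0x39,             // CODECOPY: memory[0..size] = code[12..12+size]
        0x60, size,       // PUSH1 size
        0x60, 0x00,       // PUSH1 0 (memory offset)
        0xf3,             // RETURN: memory[0..size] becomes the deployed code
    ];
    code.extend_from_slice(runtime);
    code
}
```

A Solidity constructor follows the same shape, with the constructor logic running before the final CODECOPY and RETURN.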
The Host Trait
The interpreter needs to access external state: other contracts' code, account balances, etc. I used the Host trait to decouple the interpreter from state management:
    pub trait Host {
        fn code(&self, address: &Address) -> Vec<u8>;
        fn code_size(&self, address: &Address) -> usize;
        fn balance(&self, address: &Address) -> U256;
        fn emit_log(&mut self, log: LogEntry);
        fn logs(&self) -> &[LogEntry];
    }
For testing, I implemented MockHost with HashMaps; in production, a real Host would read from a Merkle trie. This is inversion of control: the interpreter calls host.code(addr) without knowing which backend answers. revm uses the same pattern.
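A HashMap-backed mock along those lines might look like the sketch below. The trait is repeated so the block compiles on its own. `Address`, `U256`, and the `LogEntry` fields are placeholder definitions assumed for illustration (in particular, `U256` is a `u128` stand-in, not a real 256-bit integer), and `with_account` is a hypothetical helper.

```rust
use std::collections::HashMap;

// Placeholder types, assumed for this sketch only.
pub type Address = [u8; 20];
pub type U256 = u128; // stand-in: a real U256 is 256 bits wide

#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub address: Address,
    pub topics: Vec<[u8; 32]>,
    pub data: Vec<u8>,
}

pub trait Host {
    fn code(&self, address: &Address) -> Vec<u8>;
    fn code_size(&self, address: &Address) -> usize;
    fn balance(&self, address: &Address) -> U256;
    fn emit_log(&mut self, log: LogEntry);
    fn logs(&self) -> &[LogEntry];
}

/// In-memory host for tests: every lookup is a HashMap read.
#[derive(Default)]
pub struct MockHost {
    codes: HashMap<Address, Vec<u8>>,
    balances: HashMap<Address, U256>,
    logs: Vec<LogEntry>,
}

impl MockHost {
    /// Builder-style helper (hypothetical) to seed an account.
    pub fn with_account(mut self, address: Address, code: Vec<u8>, balance: U256) -> Self {
        self.codes.insert(address, code);
        self.balances.insert(address, balance);
        self
    }
}

impl Host for MockHost {
    fn code(&self, address: &Address) -> Vec<u8> {
        // Unknown accounts have empty code.
        self.codes.get(address).cloned().unwrap_or_default()
    }
    fn code_size(&self, address: &Address) -> usize {
        self.codes.get(address).map_or(0, |c| c.len())
    }
    fn balance(&self, address: &Address) -> U256 {
        // Unknown accounts have a zero balance.
        self.balances.get(address).copied().unwrap_or(0)
    }
    fn emit_log(&mut self, log: LogEntry) {
        self.logs.push(log);
    }
    fn logs(&self) -> &[LogEntry] {
        &self.logs
    }
}
```

An interpreter written against `&mut dyn Host` (or a generic `H: Host`) can then be tested with `MockHost` and later pointed at a trie-backed implementation without changes.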
LOG Opcodes
LOG0 through LOG4 — emit event logs
LOGn pops an offset and size from the stack, then pops n topic values. It reads size bytes from memory starting at offset and creates a log entry.
Before: Stack [ ..., topic1, 32, 0x00 ] ← offset on top, size below, then topics
Memory[0x00..0x20] = <event data>
LOG1: pop offset (0x00), pop size (32), pop topic (topic1)
emit log { topics: [topic1], data: memory[0x00..0x20] }
After: Stack [ ... ]
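The pop order above can be sketched as a small function. To keep it short, stack words and topics are `u64` stand-ins for 256-bit values, the top of the stack is the last element of the `Vec`, and out-of-range reads return an error instead of expanding memory as a real EVM would. The `Log` struct and `exec_log` name are illustrative, not the project's actual types.

```rust
#[derive(Debug, PartialEq)]
pub struct Log {
    pub topics: Vec<u64>, // u64 stand-in for 32-byte topics
    pub data: Vec<u8>,
}

/// Execute LOGn against a stack and memory. The top of the stack is `stack.last()`.
pub fn exec_log(stack: &mut Vec<u64>, memory: &[u8], n: usize) -> Result<Log, &'static str> {
    if n > 4 {
        return Err("LOG supports at most 4 topics");
    }
    if stack.len() < 2 + n {
        return Err("stack underflow");
    }
    // Pop order: offset first, then size, then the n topics.
    let offset = stack.pop().unwrap() as usize;
    let size = stack.pop().unwrap() as usize;
    let topics: Vec<u64> = (0..n).map(|_| stack.pop().unwrap()).collect();

    let end = offset.checked_add(size).ok_or("offset overflow")?;
    // Simplification: a real EVM zero-extends memory here (and charges gas for it).
    let data = memory
        .get(offset..end)
        .ok_or("out-of-bounds memory read")?
        .to_vec();

    Ok(Log { topics, data })
}
```

In the full interpreter the returned entry would be passed to `host.emit_log(...)` rather than handed back to the caller.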
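Gas cost: 375 + 375 * topics + 8 * data bytes, plus any memory expansion cost (not included below):
| Opcode | Topics | Gas (32 bytes data) |
|---|---|---|
| LOG0 | 0 | 631 |
| LOG1 | 1 | 1006 |
| LOG2 | 2 | 1381 |
| LOG3 | 3 | 1756 |
| LOG4 | 4 | 2131 |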
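A sketch of that cost as a function, assuming the Yellow Paper fee constants (G_log = 375, G_logtopic = 375, G_logdata = 8) and ignoring memory expansion, which the interpreter would charge separately. The constant and function names are my own.

```rust
// Fee schedule constants (Yellow Paper, Appendix G).
const G_LOG: u64 = 375;       // base cost of any LOG
const G_LOG_TOPIC: u64 = 375; // per topic
const G_LOG_DATA: u64 = 8;    // per byte of log data

/// Gas for LOGn with `topics` topics and `data_len` bytes of data,
/// excluding memory expansion.
pub fn log_gas(topics: u64, data_len: u64) -> u64 {
    G_LOG + G_LOG_TOPIC * topics + G_LOG_DATA * data_len
}
```

For example, `log_gas(2, 32)` evaluates to 375 + 750 + 256 = 1381.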
Logs are not accessible from within the EVM — they are write-only. Off-chain clients read them via the JSON-RPC eth_getLogs API. I found this asymmetry easy to overlook at first.
Exercises
- 4.1 — CREATE address derivation is deterministic and varies with sender/nonce
- 4.2 — CREATE2 address derivation varies with salt and initcode
- 4.3 — MockHost compiles and implements all Host methods
- 4.4 — EXTCODESIZE returns correct values via Host
- 4.5 — LOG0 gas cost matches the formula
- 4.6 — LOG2 emits correct topics
Run: cargo test contract and cargo test host