Contracts, Deployment, and the Host Trait

Today I crossed the boundary from a single-contract interpreter to a system that can create contracts, emit logs, and interact with external state.

CREATE and CREATE2

When a contract creates another contract, the EVM needs to determine the new contract's address. Here is what I found:

CREATE: $\text{addr} = \text{keccak256}(\text{rlp}([\text{sender}, \text{nonce}]))[12:]$

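The address depends on the sender and the sender's nonce. For an externally owned account the nonce counts the transactions it has sent; for a contract it counts the contracts it has created. The address is therefore predictable only if you know the sender's nonce at the moment of creation.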
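To make the formula concrete, here is a minimal sketch of the RLP preimage that CREATE hashes. The function name `create_preimage` is my own, and the sketch stops before the hash: feeding the result to a Keccak-256 implementation and keeping the last 20 bytes would give the address.

```rust
/// RLP-encode the two-item list [sender, nonce].
/// Hypothetical helper: it only builds the bytes that get hashed.
pub fn create_preimage(sender: &[u8; 20], nonce: u64) -> Vec<u8> {
    // A 20-byte string encodes as 0x80 + 20 = 0x94 followed by the bytes.
    let mut payload = vec![0x80 + 20];
    payload.extend_from_slice(sender);

    // Integers are big-endian with leading zero bytes stripped.
    let be = nonce.to_be_bytes();
    let first_nonzero = be.iter().position(|&b| b != 0).unwrap_or(be.len());
    let nonce_bytes = &be[first_nonzero..];
    match nonce_bytes {
        // Zero is the empty string.
        [] => payload.push(0x80),
        // A single byte below 0x80 encodes as itself.
        [b] if *b < 0x80 => payload.push(*b),
        // Otherwise: length prefix, then the bytes.
        _ => {
            payload.push(0x80 + nonce_bytes.len() as u8);
            payload.extend_from_slice(nonce_bytes);
        }
    }

    // The payload is at most 21 + 9 = 30 bytes, so the short-list
    // prefix (0xc0 + length) always applies here.
    let mut out = vec![0xc0 + payload.len() as u8];
    out.extend_from_slice(&payload);
    out
}
```

For nonces 0 through 127 the encoding is always 23 bytes and starts with 0xd6 0x94, which is why that prefix shows up in many hand-rolled address calculators.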

CREATE2: $\text{addr} = \text{keccak256}(\texttt{0xff} \parallel \text{sender} \parallel \text{salt} \parallel \text{keccak256}(\text{initcode}))[12:]$

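CREATE2 (EIP-1014) replaces the nonce with a caller-chosen salt and the hash of the initcode. The address no longer depends on how many contracts the sender has already created, so you can compute it long before deployment. This enables: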

Factory patterns: deploy contracts at known addresses
Counterfactual instantiation: reference a contract before it exists
Account Abstraction: pre-compute wallet addresses
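The CREATE2 formula can be sketched as follows. To keep the example free of dependencies, the hash function is passed in as a parameter; that is an assumption of this sketch, and in practice you would plug in a real Keccak-256 implementation. The name `create2_address` is hypothetical.

```rust
/// Derive a CREATE2 address from sender, salt, and initcode.
/// `keccak256` is supplied by the caller (an assumption of this sketch).
pub fn create2_address<H>(
    keccak256: H,
    sender: &[u8; 20],
    salt: &[u8; 32],
    initcode: &[u8],
) -> [u8; 20]
where
    H: Fn(&[u8]) -> [u8; 32],
{
    // Preimage layout: 0xff (1) || sender (20) || salt (32) || hash(initcode) (32)
    let mut preimage = Vec::with_capacity(85);
    preimage.push(0xff);
    preimage.extend_from_slice(sender);
    preimage.extend_from_slice(salt);
    preimage.extend_from_slice(&keccak256(initcode));

    // The address is the last 20 bytes of the 32-byte hash.
    let hash = keccak256(&preimage);
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&hash[12..]);
    addr
}
```

The preimage is always 85 bytes regardless of initcode size, because the initcode is hashed first. Nothing in it depends on chain state, which is what makes counterfactual addresses possible.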

Initcode vs. Runtime Code

When deploying a contract, you don't send the runtime bytecode directly. You send initcode: a program that executes once, and whose RETURN data becomes the deployed code.

Transaction: { to: null, data: initcode }
  → EVM executes initcode
  → initcode calls RETURN(offset, size)
  → bytes at memory[offset..offset+size] become the deployed code

This is why Solidity constructors can run arbitrary logic — they are part of the initcode.
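As an illustration, here is a sketch that wraps a piece of runtime code in the smallest initcode I could think of: copy the runtime bytes into memory with CODECOPY, then RETURN them. The helper name `wrap_in_initcode` is hypothetical, and the sketch only handles runtime code short enough for a PUSH1 operand (at most 255 bytes).

```rust
const PUSH1: u8 = 0x60;
const CODECOPY: u8 = 0x39;
const RETURN: u8 = 0xf3;

/// Length of the fixed prefix emitted below.
const PREFIX_LEN: u8 = 12;

/// Prepend a 12-byte deployer to `runtime`.
/// Returns None if the runtime code is too long for a PUSH1 operand.
pub fn wrap_in_initcode(runtime: &[u8]) -> Option<Vec<u8>> {
    if runtime.len() > 255 {
        return None;
    }
    let size = runtime.len() as u8;
    let mut code = vec![
        PUSH1, size,       // CODECOPY size
        PUSH1, PREFIX_LEN, // CODECOPY source offset: runtime starts after the prefix
        PUSH1, 0x00,       // CODECOPY destination in memory (ends up on top of the stack)
        CODECOPY,          // memory[0..size] = code[12..12+size]
        PUSH1, size,       // RETURN size
        PUSH1, 0x00,       // RETURN offset (ends up on top of the stack)
        RETURN,            // memory[0..size] becomes the deployed code
    ];
    code.extend_from_slice(runtime);
    Some(code)
}
```

A constructor compiled by Solidity does the same thing at the end, after running whatever setup logic the constructor body contains.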

The Host Trait

The interpreter needs to access external state: other contracts' code, account balances, etc. I used the Host trait to decouple the interpreter from state management:

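/// Interface between the interpreter and external chain state.
pub trait Host {
    /// Bytecode stored at `address`.
    fn code(&self, address: &Address) -> Vec<u8>;
    /// Length in bytes of the code at `address` (used by EXTCODESIZE).
    fn code_size(&self, address: &Address) -> usize;
    /// Balance of `address`.
    fn balance(&self, address: &Address) -> U256;
    /// Record a log entry produced by a LOGn opcode.
    fn emit_log(&mut self, log: LogEntry);
    /// All log entries recorded so far.
    fn logs(&self) -> &[LogEntry];
}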

For testing, I implemented MockHost with HashMaps. In production, a real Host would read from a Merkle trie.
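A HashMap-backed mock along these lines might look like the sketch below. The `Address`, `U256`, and `LogEntry` definitions are placeholders I made up so the example compiles on its own (in particular, `U256` is a `u128` stand-in, not a real 256-bit integer), and the `with_account` builder is hypothetical.

```rust
use std::collections::HashMap;

// Placeholder types, assumed for this sketch only.
pub type Address = [u8; 20];
pub type U256 = u128; // stand-in for a real 256-bit integer type

#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub address: Address,
    pub topics: Vec<U256>,
    pub data: Vec<u8>,
}

pub trait Host {
    fn code(&self, address: &Address) -> Vec<u8>;
    fn code_size(&self, address: &Address) -> usize;
    fn balance(&self, address: &Address) -> U256;
    fn emit_log(&mut self, log: LogEntry);
    fn logs(&self) -> &[LogEntry];
}

/// In-memory host for tests: unknown accounts read as empty.
#[derive(Default)]
pub struct MockHost {
    code: HashMap<Address, Vec<u8>>,
    balances: HashMap<Address, U256>,
    logs: Vec<LogEntry>,
}

impl MockHost {
    /// Hypothetical builder for seeding an account.
    pub fn with_account(mut self, address: Address, code: Vec<u8>, balance: U256) -> Self {
        self.code.insert(address, code);
        self.balances.insert(address, balance);
        self
    }
}

impl Host for MockHost {
    fn code(&self, address: &Address) -> Vec<u8> {
        self.code.get(address).cloned().unwrap_or_default()
    }
    fn code_size(&self, address: &Address) -> usize {
        self.code.get(address).map_or(0, Vec::len)
    }
    fn balance(&self, address: &Address) -> U256 {
        self.balances.get(address).copied().unwrap_or_default()
    }
    fn emit_log(&mut self, log: LogEntry) {
        self.logs.push(log);
    }
    fn logs(&self) -> &[LogEntry] {
        &self.logs
    }
}
```

Treating missing accounts as empty code and zero balance mirrors how the EVM reads nonexistent accounts, and it keeps test setup short.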

Rust Pattern: Trait Objects

The Host trait enables inversion of control. The interpreter calls host.code(addr) without knowing whether it's backed by a HashMap (testing) or a Merkle trie (production). This is the same pattern revm uses.
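Here is a small sketch of that dispatch. The trait is trimmed to a single method for brevity, and `MapHost`, `EmptyHost`, and `ext_code_size` are names invented for the example. `EmptyHost` stands in for "some other backend"; it is not a trie.

```rust
use std::collections::HashMap;

type Address = [u8; 20];

/// Trimmed-down Host with only the method this example needs.
pub trait Host {
    fn code_size(&self, address: &Address) -> usize;
}

/// Backend 1: a HashMap, as a test double would use.
pub struct MapHost(pub HashMap<Address, Vec<u8>>);

impl Host for MapHost {
    fn code_size(&self, address: &Address) -> usize {
        self.0.get(address).map_or(0, Vec::len)
    }
}

/// Backend 2: a stand-in for any other state source; every account is empty.
pub struct EmptyHost;

impl Host for EmptyHost {
    fn code_size(&self, _address: &Address) -> usize {
        0
    }
}

/// EXTCODESIZE sketch: the interpreter only sees `dyn Host`,
/// so the concrete backend is chosen by the caller at runtime.
pub fn ext_code_size(host: &dyn Host, address: &Address) -> usize {
    host.code_size(address)
}
```

A generic parameter (`fn ext_code_size<H: Host>(host: &H, ...)`) would give the same decoupling with static dispatch. The trait object version trades a virtual call for a single compiled copy of the interpreter.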

LOG Opcodes

LOG0 through LOG4 — emit event logs

LOGn pops an offset and size from the stack, then pops n topic values. It reads size bytes from memory starting at offset and creates a log entry.

Before:  Stack [ ..., topic1, 32, 0x00 ]  ← offset on top, size below, then topics
         Memory[0x00..0x20] = <event data>
LOG1: pop offset (0x00), pop size (32), pop topic (topic1)
      emit log { topics: [topic1], data: memory[0x00..0x20] }
After:   Stack [ ... ]
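The pop order above can be sketched in a few lines. This is a simplified model, not the interpreter's real code: stack words are `u64` instead of 256-bit values, the top of the stack is the end of the `Vec`, and reading past the end of memory is treated as a failure, whereas a real EVM would expand memory. `exec_log` and this `LogEntry` are names made up for the example.

```rust
#[derive(Debug, PartialEq)]
pub struct LogEntry {
    pub topics: Vec<u64>,
    pub data: Vec<u8>,
}

/// Execute LOGn (n = 0..=4) against a simplified stack and memory.
/// Returns None on an invalid n, stack underflow, or an out-of-range read.
pub fn exec_log(n: usize, stack: &mut Vec<u64>, memory: &[u8]) -> Option<LogEntry> {
    // LOGn needs offset + size + n topics on the stack.
    if n > 4 || stack.len() < n + 2 {
        return None;
    }
    let offset = stack.pop()? as usize; // top of stack
    let size = stack.pop()? as usize;   // next
    let mut topics = Vec::with_capacity(n);
    for _ in 0..n {
        topics.push(stack.pop()?);      // topics come off in order
    }
    // Copy memory[offset..offset+size] into the log's data field.
    let end = offset.checked_add(size)?;
    let data = memory.get(offset..end)?.to_vec();
    Some(LogEntry { topics, data })
}
```

With the stack from the diagram (`[..., topic1, 32, 0]`), `exec_log(1, ...)` yields one topic and 32 bytes of data, and leaves everything below `topic1` untouched.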

Gas cost: $375 + 8 \times \text{data\_length} + 375 \times \text{num\_topics}$ (plus any memory expansion cost, which the table below leaves out)

Opcode	Topics	Gas (32 bytes data)
`LOG0`	0	$375 + 256 = 631$
`LOG1`	1	$375 + 256 + 375 = 1006$
`LOG2`	2	$375 + 256 + 750 = 1381$
`LOG3`	3	$375 + 256 + 1125 = 1756$
`LOG4`	4	$375 + 256 + 1500 = 2131$
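The formula is short enough to write out directly. This sketch covers only the static part of the cost and ignores memory expansion; the constant and function names are my own.

```rust
pub const LOG_BASE_GAS: u64 = 375;
pub const LOG_DATA_GAS: u64 = 8; // per byte of log data
pub const LOG_TOPIC_GAS: u64 = 375; // per topic

/// Static gas for LOGn, excluding memory expansion.
/// Returns None for more than 4 topics or on arithmetic overflow.
pub fn log_gas(num_topics: u64, data_length: u64) -> Option<u64> {
    if num_topics > 4 {
        return None;
    }
    let data_cost = LOG_DATA_GAS.checked_mul(data_length)?;
    let topic_cost = LOG_TOPIC_GAS * num_topics; // at most 1500, cannot overflow
    LOG_BASE_GAS.checked_add(data_cost)?.checked_add(topic_cost)
}
```

Calling `log_gas(n, 32)` for n from 0 to 4 reproduces the table column: 631, 1006, 1381, 1756, 2131.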

Logs are not accessible from within the EVM — they are write-only. Off-chain clients read them via the JSON-RPC eth_getLogs API. I found this asymmetry easy to overlook at first.

Exercises

4.1 — CREATE address derivation is deterministic and varies with sender/nonce
4.2 — CREATE2 address derivation varies with salt and initcode
4.3 — MockHost compiles and implements all Host methods
4.4 — EXTCODESIZE returns correct values via Host
4.5 — LOG0 gas cost matches the formula
4.6 — LOG2 emits correct topics

Run: cargo test contract and cargo test host
