AVM Instruction Set
{-# OPTIONS --without-K --type-in-type --guardedness --exact-split #-} module AVM.Instruction where open import Foundation.BasicTypes hiding (ObjectId) open import Foundation.InteractionTrees
Motivation and Context
- Sources of inspiration:
Disclaimer
This post is a work in progress. It is not yet complete and may contain errors. Everything below is a proposal for defining the final instruction set of the Anoma Virtual Machine and how to define AVM programs.
What is the AVM?
AVM stands for Anoma Virtual Machine. It is a deterministic, transaction-oriented virtual machine for message-driven objects. We will unfold all these concepts here.
The concept of object is the deterministic sequential object explored in the Objects module. Briefly, an object is a function from input sequences to output sequences. In this view, an object "reacts" to messages, input sequences, and produces output sequences, possibly empty.
What is a message in the AVM?
Messages in the AVM are represented as S-expressions. The Val type defines the structure of these S-expressions. It is a sum type of integers, booleans, strings, pairs, and lists.
data Val : Set where VInt : ℕ → Val VBool : Bool → Val VString : String → Val VPair : Val → Val → Val VNil : Val VList : List Val → Val
This S-expression format allows objects to exchange structured data while maintaining a simple, uniform representation. We define semantic aliases to clarify the message-passing nature of object interaction:
Message : Set Message = Val
Input : Set Input = Message
Output : Set Output = Message
What is an object in the AVM?
The object model in the AVM is minimal. An object is a record with a history of inputs and a partial function from input sequences to output sequences.
record Object : Set where constructor mkObj field history : List Input object-type : List Input → Maybe (List Output)
Given an Object, one can query its output, transition the object to a new state by processing new input, or determine whether it can process an input, among other operations.
The AVM as a state machine
As a virtual machine, the AVM is a state machine that executes programs—sequences of instructions or events—within an ambient execution frame. Programs interact with this external environment through the execution of instructions. The environment provides the necessary context for program execution: the objects being operated upon, their inputs, and the outputs they produce.
Central to the AVM is the notion of an instruction and event. We model the AVM's instruction set as an algebraic datatype whose constructors precisely define the admissible state transitions.
With an instruction set type in place, the semantics of AVMProgram can be defined in terms of interaction trees. From these, we derive an operational semantics as a small-step transition system, enabling formal verification of properties such as determinism.
State of the AVM: The Execution Environment
Programs run on objects, and objects live in a node/controller. Think of a node/controller as a server or computer that hosts the objects, and the AVM instance as another program running on the node/controller where we can interact with the objects.
Node identifiers are aliased to NodeId.
NodeId : Set NodeId = String
Objects are identified by ObjectId. The specifics of the ObjectId type are not important for the AVM. It is just a unique identifier for an object. Say for example that the ObjectId is a pair of a NodeId and a VString.
ObjectId : Set ObjectId = NodeId × String
We will also assume that:
- all objects live in the same node (forever),
- the ambient execution frame is the node itself, and
- no other node is considered.
Future work could investigate how these assumptions could be relaxed.
Store type
As part of the execution, the AVM provides a view of the store of all objects in the AVM. However, this view is partial because an object may not yet exist in the store or may not be available. This is represented by the Store function type, where a store is a partial function from ObjectId to Object.
Store : Set Store = ObjectId -> Maybe Object
Transactional semantics
The AVM is a transactional machine. Programs can induce discrete state changes as atomic blocks, ensuring that state changes are either fully applied or have no effect at all. Therefore, some "memory" is needed in the current instance of the AVM to store these tentative state changes—precisely the inputs sent to each object. This is represented by the TxLog function type, where a TxLog is a partial function from ObjectId to Input.
TxLog : Set TxLog = ObjectId -> Maybe Input
As in TxLog, we use the TxLog prefix to denote transaction-related terms, following standard literature conventions. Note that the term "transaction" in the Anoma ecosystem has a different meaning than here: it can refer to Resource Machine transactions (the lowest level of atomicity that AVM programs ultimately compile to).
Each AVM transaction is uniquely identified by a TxId.
postulate Hash : Set TxId : Set TxId = Hash
Ctx datatype
At any given point in the execution of the AVM, the current object is the one that is being executed. The current input is the input that is being processed by the current object.
record Ctx : Set where constructor ctx field self : ObjectId input : Input traceMode : Bool -- Whether to trace the execution of the AVM. handy?
State datatype
The AVM maintains an execution state that captures all information needed for transactional computation. The state machine processes instructions by taking the current state and an input sequence as input, producing a new state, output, and execution trace.
The State record encapsulates the transactional context:
record State : Set where constructor st field store : Store -- Persistent store of committed objects L : TxLog -- Tentative write-set for the current transaction tx : Maybe TxId -- Optional identifier of the open transaction y : Ctx -- Execution context, it includes the current object and input
Each field serves a specific role in transaction processing:
storeholds objects that have been successfully committed.Laccumulates writes within the current transaction (uncommitted changes).txtracks whether a transaction is currently active (some id) or not (nothing).yprovides additional execution context, it includes the current object and input.
Note that the input to the program is present in the execution context y—this
is the input being processed by the current object.
Then, a first approximation of the signature of the state transition function is:
AVM : State -> Instruction R -> (State × R × Trace)
Here, Instruction is a type-level function from the instruction set of the AVM
to the result type of the instruction. Trace is a list of instruction-specific
AVMEvent that occurred during the execution of the instruction.
record AVMEvent : Set where constructor ev field event-name : String event-data : List Val event-result : Maybe Val event-error : Maybe Val -- you name it
Trace : Set Trace = List AVMEvent
Building on this view, the signature of the AVM state transition function is:
AVM : State -> Program R -> (State × R × Trace)
Here, Program is a set of interaction trees over the instruction set.
Now it is time to define our first candidate for the instruction set of the AVM.
Baseline ISA
The AVM instruction set architecture (ISA) defines the primitive operations available to programs. Instructions are modeled as an inductive datatype where each constructor specifies its return type. ISA is the type-level function signature of the instruction set.
ISA : Set ISA = Set -> Set
The AVM requires additional machinery to handle introspection and debugging. We need to track the safety of operations—that is, whether they are safe to use in a safe context.
Safety level for operations
Two safety levels are defined: Safe and Unsafe.
data Safety : Set where Safe : Safety Unsafe : Safety
Filter : Set Filter = ObjectId -> Bool
Dynamictype: a dynamic metadata representation.
postulate Dynamic : Set -- Dynamic metadata representation
The baseline instruction set Instr₀ provides ten primitive operations:
data Instr₀ : Safety -> ISA where -- Transaction management beginTx : Instr₀ Safe TxId -- Start new transaction, returns its ID commitTx : TxId → Instr₀ Safe Bool -- Commit transaction, returns success abortTx : TxId → Instr₀ Safe ⊤ -- Abort transaction, always succeeds -- Object lifecycle createObj : Input → Instr₀ Safe ObjectId -- Create object from input destroyObj : ObjectId → Instr₀ Safe ⊤ -- Remove object from store -- Object interaction (pure message-passing) call : ObjectId → Input → Instr₀ Safe Output -- Send input, await output -- Safe introspection self : Instr₀ Safe ObjectId -- Get current object ID input : Instr₀ Safe Input -- Receive external input scry : Filter → Instr₀ Unsafe (List ObjectId) -- Query ALL objects reflect : ObjectId → Instr₀ Unsafe Dynamic -- Inspect ANY object
Operations are marked as unsafe when they violate core principles of the object model or pose systemic risks. Unsafe operations break encapsulation by accessing object internals without going through the object's interface, as is the case with reflect. They bypass access control by allowing unrestricted inspection or discovery of objects, potentially exposing sensitive information or implementation details. They also risk unbounded computation by triggering expensive global operations that scale with the total number of objects in the system.
Consider scry: it applies a filter predicate across all objects in the store. In a large system with millions of millions of objects, even a simple filter could cause significant performance degradation. A malicious or poorly written filter (e.g., one that always returns true) could force enumeration of the entire object store.
Similarly, reflect allows external inspection of any object's metadata without that object's consent, breaking the principle that objects should control what information they expose by processing a message—which is not the case with reflect.
Safe programs cannot use unsafe operations. This design enforces at compile time that safe programs remain isolated from potentially dangerous system-level operations. These safe programs constitute what we formally define as "AVM programs"—the standard execution model for user-level code in the AVM.
AVMProgram : Set → Set AVMProgram = ITree (Instr₀ Safe)
And, knowing the risks of unsafe operations, we can define a SystemProgram as any program that can use unsafe operations.
SystemProgram : Set → Set SystemProgram A = ITree (λ B → Σ Safety (λ s → Instr₀ s B)) A
Instruction Set Semantics
The Instr₀ instruction set comprises primitive operations organized into four categories: transaction control, object lifecycle, object interaction, and introspection. Each instruction includes its safety level in the type signature. With Instr₀, we have the syntax, but not the semantics, what happens when we execute them.
We can define a small-step transition system for the instruction set in Agda to provide formal semantics for each operation. This would establish precise operational behavior through state transitions. We leave this formalization for future work. Below, we describe the intended semantics of each instruction in natural language for now.
Transaction Control
Transaction control instructions manage atomic execution contexts. All state modifications within a transaction are tentative until committed.
beginTx
beginTx : Instr₀ Safe TxId
Initiates a new transactional context and returns a fresh transaction identifier. All subsequent state modifications are logged to the transaction's write-set until the transaction is either committed or aborted. Transactions provide atomicity: either all changes succeed or none do.
commitTx
commitTx : TxId → Instr₀ Safe Bool
Attempts to commit the transaction identified by the given TxId. It is a synchronous operation that returns a boolean. Returns true if the commit succeeds, persisting all logged changes to the store. Returns false if the transaction cannot be committed (e.g., conflicts with concurrent transactions or the transaction was already finalized).
abortTx
abortTx : TxId → Instr₀ Safe ⊤
Aborts the transaction identified by the given TxId, discarding all tentative state changes in its write-set. The store reverts to its state before beginTx was called. This operation always succeeds and returns unit.
Object Lifecycle
Object lifecycle instructions manage the creation and destruction of objects in the store.
createObj
createObj : Input → Instr₀ Safe ObjectId
Creates a new object in the store from an Input. The Input may be used to specify the object's kind, initial program code, and metadata. Returns a fresh ObjectId that uniquely identifies the new object. Object creation is transactional: if the enclosing transaction aborts, the object is not created.
destroyObj
destroyObj : ObjectId → Instr₀ Safe ⊤
Marks the object identified by the given ObjectId for destruction. The object is removed from the store, and subsequent references to this ObjectId will fail. Destruction is transactional: the object remains accessible within the current transaction until committed.
Object Interaction
Object interaction is achieved through pure message-passing, preserving object encapsulation. Message passing is the only way to interact with objects in the AVM.
call
call : ObjectId → Input → Instr₀ Safe Output
Performs synchronous message passing to the object identified by the given ObjectId. Sends the input and blocks until the target object produces an output. The output is returned to the caller.
Introspection and Context
Introspection instructions query the execution environment and object metadata.
self
self : Instr₀ Safe ObjectId
Returns the ObjectId of the currently executing object. This instruction is essential for recursion and self-reference, allowing an object to pass its own identifier to other objects or invoke itself. See also the use of self in defining purely functional resources in the AVM.
input
input : Instr₀ Safe Input
This instruction blocks until input is available from the caller or external context. This is the input being processed by the current object from Ctx.
scry (Unsafe)
scry : Filter → Instr₀ Unsafe (List ObjectId)
Queries the store for objects matching the given predicate Filter. Returns a list of ObjectId for all objects satisfying the filter criteria. Marked as unsafe because it can enumerate all objects in the system, potentially causing performance degradation.
reflect (Unsafe)
reflect : ObjectId → Instr₀ Unsafe Dynamic
Retrieves metadata about the object identified by the given ObjectId. Returns structured metadata. Marked as unsafe because it bypasses object encapsulation.
Bytecode Encoding
The typed Instr₀ representation provides type safety and enables formal reasoning, but practical AVM implementations require a serializable bytecode format for storage, transmission, and execution. The bytecode encoding and decoding mechanisms are defined in the Bytecode module, which provides:
- Untyped bytecode representation with opcodes
- Encoding functions from typed instructions to bytecode
- Decoding with runtime type checking
- Program serialization
See AVM Bytecode Encoding for the complete specification.
Examples
Transfer
This example demonstrates an atomic transfer operation between two account objects. We assume the following context:
- An Account object type is defined.
- Account objects support
withdrawanddepositmethods that modify a balance field - Two Account objects are identified by
fromAccandtoAcc
We define helper functions for creating messages:
withdraw : ℕ → Message withdraw n = VList (VString "withdraw" ∷ VInt n ∷ [])
deposit : ℕ → Message deposit n = VList (VString "deposit" ∷ VInt n ∷ [])
ok? : Val → Bool ok? (VString "ok") = true ok? (VString _) = false -- Any other string ok? (VInt _) = false ok? (VBool _) = false ok? (VPair _ _) = false ok? VNil = false ok? (VList _) = false
Given two object identifiers and an amount, we define a transfer function with type signature:
kudosTransfer : ObjectId → ObjectId → ℕ → AVMProgram Val
The implementation ensures atomicity through explicit transaction management:
kudosTransfer fromAcc toAcc amount = trigger beginTx >>= λ txId → trigger (call fromAcc (withdraw amount)) >>= λ r₁ → (if ok? r₁ then (trigger (call toAcc (deposit amount)) >>= λ r₂ → if ok? r₂ then (trigger (commitTx txId) >>= λ success → if success then ret (VString "transfer-complete") else ret (VString "commit-failed")) else (trigger (abortTx txId) >>= λ _ → ret (VString "deposit-failed"))) else (trigger (abortTx txId) >>= λ _ → ret (VString "insufficient-funds")))
The transfer operation is atomic: either both the withdrawal and deposit succeed and are committed, or the transaction is aborted and no state changes occur.
Example: Object Creation and Initialization
Creating a new counter object and performing initial operations:
createCounter : Val → AVMProgram ObjectId createCounter initialValue = trigger (createObj (VList (VString "counter" ∷ initialValue ∷ []))) >>= λ counterId → trigger (call counterId (VString "increment")) >>= λ _ → ret counterId -- Using the counter counterExample : AVMProgram Val counterExample = createCounter (VInt 0) >>= λ ctr → trigger (call ctr (VString "increment")) >>= λ _ → trigger (call ctr (VString "increment")) >>= λ _ → trigger (call ctr (VString "get")) >>= λ v₃ → ret v₃ -- Returns VInt 3
The counter maintains its state across calls, demonstrating the stateful nature of objects created through createObj.
Example: Balance Update with Validation
Updating an account balance through message-passing with validation:
updateBalance : ObjectId → Val → AVMProgram Val updateBalance account v = helper v where invalidMsg : Val invalidMsg = VString "invalid-value-type" helper : Val → AVMProgram Val helper (VInt n) = -- Validate non-negative (if 0 ≤? n then -- Get current balance via message (trigger (call account (VString "get-balance")) >>= λ oldValue → -- Update via message trigger (call account (VList (VString "set-balance" ∷ VInt n ∷ []))) >>= λ result → if ok? result then ret (VString "balance-updated") else ret (VString "update-failed")) else ret (VString "invalid-balance")) helper (VBool _) = ret invalidMsg helper (VString _) = ret invalidMsg helper (VPair _ _) = ret invalidMsg helper VNil = ret invalidMsg helper (VList _) = ret invalidMsg
Balance updates are validated before being applied through the object's message interface.
Batch transfer
Calling a multi-method that operates on multiple objects simultaneously:
transferBatch : List (ObjectId × ObjectId × ℕ) → AVMProgram Val transferBatch transfers = processTransfers transfers where processTransfers : List (ObjectId × ObjectId × ℕ) → AVMProgram Val processTransfers [] = ret (VString "batch-complete") processTransfers ((from , to , amount) ∷ rest) = kudosTransfer from to amount >>= λ result → (if ok? result then processTransfers rest else ret (VString "batch-failed"))
Batch operations can be made atomic by wrapping them in appropriate transaction boundaries.
Example: Object Destruction with Cleanup
Destroying an object after transferring its remaining balance:
closeAccount : ObjectId → ObjectId → AVMProgram Val closeAccount accountToClose beneficiary = -- Get final balance trigger (call accountToClose (VString "get-balance")) >>= λ balance → closeWithBalance balance where closeWithBalance : Val → AVMProgram Val closeWithBalance (VInt n) = if 0 <? n then -- Transfer remaining funds (trigger (call accountToClose (withdraw n)) >>= λ _ → trigger (call beneficiary (deposit n)) >>= λ _ → -- Destroy the account trigger (destroyObj accountToClose) >>= λ _ → ret (VString "account-closed")) else -- No balance, just destroy (trigger (destroyObj accountToClose) >>= λ _ → ret (VString "account-closed-empty")) closeWithBalance (VString _) = ret (VString "invalid-balance") closeWithBalance (VBool _) = ret (VString "invalid-balance") closeWithBalance (VPair _ _) = ret (VString "invalid-balance") closeWithBalance VNil = ret (VString "invalid-balance") closeWithBalance (VList _) = ret (VString "invalid-balance")
This ensures no funds are lost when an account is closed - the remaining balance is transferred before destruction using destroyObj.
What's next?
We just defined the instruction set syntax. This is the first iteration. However, we need to define an interpreter for the instruction set. Otherwise, how exactly does the execution of a program actually work?
interpreter : State -> Instr₀ Safe -> (State × Output × Trace)
Also, from here, it should be relatively easy to prove:
- atomicity (all-or-nothing per tx)
- determinism
References to other modules
This page references the following modules:
- Foundation.BasicTypes
- Symbols: ×, +, ⊥, Bool, ≡, Dec, Maybe, ℕ, List, ∈, Σ, ⊤, ↔, RawMonad, ,, ,Σ, inl, inr, tt, true, false, refl, yes, no, nothing, just, zero, suc, [], ∷, here, there, mk↔, fst, snd, ∃, ∃-syntax, ⊥-elim, ¬, not, sym, trans, cong, just≢nothing, just-injective, map-maybe, length, ++-assoc, ++-right-id, map, foldr, FinSet, Pred, case_of_, ∧, ∨, if_then_else_, ≢, begin_, ≡⟨⟩, ≡⟨⟩_, _∎, >>=ᴹ, +ℕ, *ℕ, ≥, ≤, ≤?, <?, ≟ℕ, ++, ∈Pred, ⊆, ∘, >>
- Goose.Anoma.Identities
- Symbols: NodeId, ObjectId, IDParams, Backend, Capabilities, Ed25519, Secp256k1, BLS, localMemory, localConnection, remoteConnection, commit, commitAndDecrypt, Ciphertext, Plaintext, Signable, ExternalId, InternalId, Identity, Commitment, EngineName, ExternalIdentity, EngineId, isLocalEngineID, isRemoteEngineID
Symbol disambiguation log
The following references had multiple matches and were disambiguated:
Tx: choseAVM.Instruction.TxLogfrom ['AVM.Instruction.TxLog', 'AVM.Instruction.TxId', 'AVM.Instruction.beginTx']
Module References
Referenced By
This module is referenced by:
- AVM.Bytecode
- AVM.Objects
References
This module references:
- Foundation.BasicTypes
- Foundation.InteractionTrees
?>