Skip to content

AVM Instruction Set

{-# OPTIONS --without-K --type-in-type --guardedness --exact-split #-}
module AVM.Instruction where

open import Foundation.BasicTypes hiding (ObjectId)
open import Foundation.InteractionTrees

Motivation and Context

Disclaimer

This post is a work in progress. It is not yet complete and may contain errors. Everything below is a proposal for defining the final instruction set of the Anoma Virtual Machine and how to define AVM programs.

What is the AVM?

AVM stands for Anoma Virtual Machine. It is a deterministic, transaction-oriented virtual machine for message-driven objects. We will unfold all these concepts here.

The concept of object is the deterministic sequential object explored in the Objects module. Briefly, an object is a function from input sequences to output sequences. In this view, an object "reacts" to messages, input sequences, and produces output sequences, possibly empty.

What is a message in the AVM?

Messages in the AVM are represented as S-expressions. The Val type defines the structure of these S-expressions. It is a sum type of integers, booleans, strings, pairs, and lists.

data Val : Set where
  VInt :   Val
  VBool : Bool  Val
  VString : String  Val
  VPair : Val  Val  Val
  VNil : Val
  VList : List Val  Val

This S-expression format allows objects to exchange structured data while maintaining a simple, uniform representation. We define semantic aliases to clarify the message-passing nature of object interaction:

Message : Set
Message = Val
  • Input is the type of inputs to an object.
  • Output is the type of outputs from an object.
Input : Set
Input = Message
Output : Set
Output = Message

What is an object in the AVM?

The object model in the AVM is minimal. An object is a record with a history of inputs and a partial function from input sequences to output sequences.

record Object : Set where
  constructor mkObj
  field
    history : List Input
    object-type : List Input  Maybe (List Output)

Given an Object, one can query its output, transition the object to a new state by processing new input, or determine whether it can process an input, among other operations.

The AVM as a state machine

As a virtual machine, the AVM is a state machine that executes programs—sequences of instructions or events—within an ambient execution frame. Programs interact with this external environment through the execution of instructions. The environment provides the necessary context for program execution: the objects being operated upon, their inputs, and the outputs they produce.

Central to the AVM is the notion of an instruction and event. We model the AVM's instruction set as an algebraic datatype whose constructors precisely define the admissible state transitions.

With an instruction set type in place, the semantics of AVMProgram can be defined in terms of interaction trees. From these, we derive an operational semantics as a small-step transition system, enabling formal verification of properties such as determinism.

State of the AVM: The Execution Environment

Programs run on objects, and objects live in a node/controller. Think of a node/controller as a server or computer that hosts the objects, and the AVM instance as another program running on the node/controller where we can interact with the objects.

Node identifiers are aliased to NodeId.

NodeId : Set
NodeId = String

Objects are identified by ObjectId. The specifics of the ObjectId type are not important for the AVM. It is just a unique identifier for an object. Say for example that the ObjectId is a pair of a NodeId and a VString.

ObjectId : Set
ObjectId = NodeId × String

We will also assume that:

  • all objects live in the same node (forever),
  • the ambient execution frame is the node itself, and
  • no other node is considered.

Future work could investigate how these assumptions could be relaxed.

Store type

As part of the execution, the AVM provides a view of the store of all objects in the AVM. However, this view is partial because an object may not yet exist in the store or may not be available. This is represented by the Store function type, where a store is a partial function from ObjectId to Object.

Store : Set
Store = ObjectId -> Maybe Object

Transactional semantics

The AVM is a transactional machine. Programs can induce discrete state changes as atomic blocks, ensuring that state changes are either fully applied or have no effect at all. Therefore, some "memory" is needed in the current instance of the AVM to store these tentative state changes—precisely the inputs sent to each object. This is represented by the TxLog function type, where a TxLog is a partial function from ObjectId to Input.

TxLog : Set
TxLog = ObjectId -> Maybe Input

As in TxLog, we use the TxLog prefix to denote transaction-related terms, following standard literature conventions. Note that the term "transaction" in the Anoma ecosystem has a different meaning than here: it can refer to Resource Machine transactions (the lowest level of atomicity that AVM programs ultimately compile to).

Each AVM transaction is uniquely identified by a TxId.

postulate
  Hash : Set

TxId : Set
TxId = Hash

Ctx datatype

At any given point in the execution of the AVM, the current object is the one that is being executed. The current input is the input that is being processed by the current object.

record Ctx : Set where
  constructor ctx
  field
    self : ObjectId
    input : Input
    traceMode : Bool -- Whether to trace the execution of the AVM. handy?

State datatype

The AVM maintains an execution state that captures all information needed for transactional computation. The state machine processes instructions by taking the current state and an input sequence as input, producing a new state, output, and execution trace.

The State record encapsulates the transactional context:

record State : Set where
  constructor st
  field
    store : Store      -- Persistent store of committed objects
    L : TxLog          -- Tentative write-set for the current transaction
    tx : Maybe TxId    -- Optional identifier of the open transaction
    y : Ctx            -- Execution context, it includes the current object and input

Each field serves a specific role in transaction processing:

  • store holds objects that have been successfully committed.
  • L accumulates writes within the current transaction (uncommitted changes).
  • tx tracks whether a transaction is currently active (some id) or not (nothing).
  • y provides additional execution context, it includes the current object and input.

Note that the input to the program is present in the execution context y—this is the input being processed by the current object.

Then, a first approximation of the signature of the state transition function is:

  AVM : State -> Instruction R -> (State × R × Trace)

Here, Instruction is a type-level function from the instruction set of the AVM to the result type of the instruction. Trace is a list of instruction-specific AVMEvent that occurred during the execution of the instruction.

record AVMEvent : Set where
  constructor ev
  field
    event-name : String
    event-data : List Val
    event-result : Maybe Val
    event-error : Maybe Val
    -- you name it
Trace : Set
Trace = List AVMEvent

Building on this view, the signature of the AVM state transition function is:

AVM : State -> Program R -> (State × R × Trace)

Here, Program is a set of interaction trees over the instruction set. Now it is time to define our first candidate for the instruction set of the AVM.

Baseline ISA

The AVM instruction set architecture (ISA) defines the primitive operations available to programs. Instructions are modeled as an inductive datatype where each constructor specifies its return type. ISA is the type-level function signature of the instruction set.

ISA : Set
ISA = Set -> Set

The AVM requires additional machinery to handle introspection and debugging. We need to track the safety of operations—that is, whether they are safe to use in a safe context.

Safety level for operations

Two safety levels are defined: Safe and Unsafe.

data Safety : Set where
  Safe   : Safety 
  Unsafe : Safety 
Filter : Set
Filter = ObjectId -> Bool
  • Dynamic type: a dynamic metadata representation.
postulate
  Dynamic : Set  -- Dynamic metadata representation

The baseline instruction set Instr₀ provides ten primitive operations:

data Instr₀ : Safety -> ISA where
  -- Transaction management
  beginTx   : Instr₀ Safe TxId               -- Start new transaction, returns its ID
  commitTx  : TxId  Instr₀ Safe Bool        -- Commit transaction, returns success
  abortTx   : TxId  Instr₀ Safe            -- Abort transaction, always succeeds

  -- Object lifecycle
  createObj : Input  Instr₀ Safe ObjectId   -- Create object from input
  destroyObj : ObjectId  Instr₀ Safe       -- Remove object from store

  -- Object interaction (pure message-passing)
  call      : ObjectId  Input  Instr₀ Safe Output   -- Send input, await output

  -- Safe introspection
  self      : Instr₀ Safe ObjectId                    -- Get current object ID
  input     : Instr₀ Safe Input                       -- Receive external input

  scry    : Filter  Instr₀ Unsafe (List ObjectId)  -- Query ALL objects
  reflect : ObjectId  Instr₀ Unsafe Dynamic        -- Inspect ANY object

Operations are marked as unsafe when they violate core principles of the object model or pose systemic risks. Unsafe operations break encapsulation by accessing object internals without going through the object's interface, as is the case with reflect. They bypass access control by allowing unrestricted inspection or discovery of objects, potentially exposing sensitive information or implementation details. They also risk unbounded computation by triggering expensive global operations that scale with the total number of objects in the system.

Consider scry: it applies a filter predicate across all objects in the store. In a large system with millions of millions of objects, even a simple filter could cause significant performance degradation. A malicious or poorly written filter (e.g., one that always returns true) could force enumeration of the entire object store.

Similarly, reflect allows external inspection of any object's metadata without that object's consent, breaking the principle that objects should control what information they expose by processing a message—which is not the case with reflect.

Safe programs cannot use unsafe operations. This design enforces at compile time that safe programs remain isolated from potentially dangerous system-level operations. These safe programs constitute what we formally define as "AVM programs"—the standard execution model for user-level code in the AVM.

AVMProgram : Set  Set
AVMProgram = ITree (Instr₀ Safe)

And, knowing the risks of unsafe operations, we can define a SystemProgram as any program that can use unsafe operations.

SystemProgram : Set  Set
SystemProgram A = ITree  B  Σ Safety  s  Instr₀ s B)) A

Instruction Set Semantics

The Instr₀ instruction set comprises primitive operations organized into four categories: transaction control, object lifecycle, object interaction, and introspection. Each instruction includes its safety level in the type signature. With Instr₀, we have the syntax, but not the semantics, what happens when we execute them.

We can define a small-step transition system for the instruction set in Agda to provide formal semantics for each operation. This would establish precise operational behavior through state transitions. We leave this formalization for future work. Below, we describe the intended semantics of each instruction in natural language for now.

Transaction Control

Transaction control instructions manage atomic execution contexts. All state modifications within a transaction are tentative until committed.

beginTx

beginTx : Instr₀ Safe TxId

Initiates a new transactional context and returns a fresh transaction identifier. All subsequent state modifications are logged to the transaction's write-set until the transaction is either committed or aborted. Transactions provide atomicity: either all changes succeed or none do.

commitTx

commitTx : TxId → Instr₀ Safe Bool

Attempts to commit the transaction identified by the given TxId. It is a synchronous operation that returns a boolean. Returns true if the commit succeeds, persisting all logged changes to the store. Returns false if the transaction cannot be committed (e.g., conflicts with concurrent transactions or the transaction was already finalized).

abortTx

abortTx : TxId → Instr₀ Safe ⊤

Aborts the transaction identified by the given TxId, discarding all tentative state changes in its write-set. The store reverts to its state before beginTx was called. This operation always succeeds and returns unit.

Object Lifecycle

Object lifecycle instructions manage the creation and destruction of objects in the store.

createObj

createObj : Input → Instr₀ Safe ObjectId

Creates a new object in the store from an Input. The Input may be used to specify the object's kind, initial program code, and metadata. Returns a fresh ObjectId that uniquely identifies the new object. Object creation is transactional: if the enclosing transaction aborts, the object is not created.

destroyObj

destroyObj : ObjectId → Instr₀ Safe ⊤

Marks the object identified by the given ObjectId for destruction. The object is removed from the store, and subsequent references to this ObjectId will fail. Destruction is transactional: the object remains accessible within the current transaction until committed.

Object Interaction

Object interaction is achieved through pure message-passing, preserving object encapsulation. Message passing is the only way to interact with objects in the AVM.

call

call : ObjectId → Input → Instr₀ Safe Output

Performs synchronous message passing to the object identified by the given ObjectId. Sends the input and blocks until the target object produces an output. The output is returned to the caller.

Introspection and Context

Introspection instructions query the execution environment and object metadata.

self

self : Instr₀ Safe ObjectId

Returns the ObjectId of the currently executing object. This instruction is essential for recursion and self-reference, allowing an object to pass its own identifier to other objects or invoke itself. See also the use of self in defining purely functional resources in the AVM.

input

input : Instr₀ Safe Input

This instruction blocks until input is available from the caller or external context. This is the input being processed by the current object from Ctx.

scry (Unsafe)

scry : Filter → Instr₀ Unsafe (List ObjectId)

Queries the store for objects matching the given predicate Filter. Returns a list of ObjectId for all objects satisfying the filter criteria. Marked as unsafe because it can enumerate all objects in the system, potentially causing performance degradation.

reflect (Unsafe)

reflect : ObjectId → Instr₀ Unsafe Dynamic

Retrieves metadata about the object identified by the given ObjectId. Returns structured metadata. Marked as unsafe because it bypasses object encapsulation.

Bytecode Encoding

The typed Instr₀ representation provides type safety and enables formal reasoning, but practical AVM implementations require a serializable bytecode format for storage, transmission, and execution. The bytecode encoding and decoding mechanisms are defined in the Bytecode module, which provides:

  • Untyped bytecode representation with opcodes
  • Encoding functions from typed instructions to bytecode
  • Decoding with runtime type checking
  • Program serialization

See AVM Bytecode Encoding for the complete specification.

Examples

Transfer

This example demonstrates an atomic transfer operation between two account objects. We assume the following context:

  • An Account object type is defined.
  • Account objects support withdraw and deposit methods that modify a balance field
  • Two Account objects are identified by fromAcc and toAcc

We define helper functions for creating messages:

withdraw :   Message
withdraw n = VList (VString "withdraw"  VInt n  [])
deposit :   Message
deposit n = VList (VString "deposit"  VInt n  [])
ok? : Val  Bool
ok? (VString "ok") = true
ok? (VString _) = false  -- Any other string
ok? (VInt _) = false
ok? (VBool _) = false
ok? (VPair _ _) = false
ok? VNil = false
ok? (VList _) = false

Given two object identifiers and an amount, we define a transfer function with type signature:

kudosTransfer : ObjectId  ObjectId    AVMProgram Val

The implementation ensures atomicity through explicit transaction management:

kudosTransfer fromAcc toAcc amount =
  trigger beginTx >>= λ txId 
  trigger (call fromAcc (withdraw amount)) >>= λ r₁ 
  (if ok? r₁ then
    (trigger (call toAcc (deposit amount)) >>= λ r₂ 
     if ok? r₂ then
       (trigger (commitTx txId) >>= λ success 
        if success then
          ret (VString "transfer-complete")
        else
          ret (VString "commit-failed"))
     else
       (trigger (abortTx txId) >>= λ _ 
        ret (VString "deposit-failed")))
  else
    (trigger (abortTx txId) >>= λ _ 
     ret (VString "insufficient-funds")))

The transfer operation is atomic: either both the withdrawal and deposit succeed and are committed, or the transaction is aborted and no state changes occur.

Example: Object Creation and Initialization

Creating a new counter object and performing initial operations:

createCounter : Val  AVMProgram ObjectId
createCounter initialValue =
  trigger (createObj (VList (VString "counter"  initialValue  []))) >>= λ counterId 
  trigger (call counterId (VString "increment")) >>= λ _ 
  ret counterId

-- Using the counter
counterExample : AVMProgram Val
counterExample =
  createCounter (VInt 0) >>= λ ctr 
  trigger (call ctr (VString "increment")) >>= λ _ 
  trigger (call ctr (VString "increment")) >>= λ _ 
  trigger (call ctr (VString "get")) >>= λ v₃ 
  ret v₃  -- Returns VInt 3

The counter maintains its state across calls, demonstrating the stateful nature of objects created through createObj.

Example: Balance Update with Validation

Updating an account balance through message-passing with validation:

updateBalance : ObjectId  Val  AVMProgram Val
updateBalance account v = helper v
  where
    invalidMsg : Val
    invalidMsg = VString "invalid-value-type"

    helper : Val  AVMProgram Val
    helper (VInt n) =
      -- Validate non-negative
      (if 0 ≤? n then
        -- Get current balance via message
        (trigger (call account (VString "get-balance")) >>= λ oldValue 
         -- Update via message
         trigger (call account (VList (VString "set-balance"  VInt n  []))) >>= λ result 
         if ok? result then
           ret (VString "balance-updated")
         else
           ret (VString "update-failed"))
      else
        ret (VString "invalid-balance"))
    helper (VBool _) = ret invalidMsg
    helper (VString _) = ret invalidMsg
    helper (VPair _ _) = ret invalidMsg
    helper VNil = ret invalidMsg
    helper (VList _) = ret invalidMsg

Balance updates are validated before being applied through the object's message interface.

Batch transfer

Calling a multi-method that operates on multiple objects simultaneously:

transferBatch : List (ObjectId × ObjectId × )  AVMProgram Val
transferBatch transfers =
  processTransfers transfers
  where
    processTransfers : List (ObjectId × ObjectId × )  AVMProgram Val
    processTransfers [] = ret (VString "batch-complete")
    processTransfers ((from , to , amount)  rest) =
      kudosTransfer from to amount >>= λ result 
      (if ok? result then
        processTransfers rest
      else
        ret (VString "batch-failed"))

Batch operations can be made atomic by wrapping them in appropriate transaction boundaries.

Example: Object Destruction with Cleanup

Destroying an object after transferring its remaining balance:

closeAccount : ObjectId  ObjectId  AVMProgram Val
closeAccount accountToClose beneficiary =
  -- Get final balance
  trigger (call accountToClose (VString "get-balance")) >>= λ balance 
  closeWithBalance balance
  where
    closeWithBalance : Val  AVMProgram Val
    closeWithBalance (VInt n) =
      if 0 <? n then
        -- Transfer remaining funds
        (trigger (call accountToClose (withdraw n)) >>= λ _ 
         trigger (call beneficiary (deposit n)) >>= λ _ 
         -- Destroy the account
         trigger (destroyObj accountToClose) >>= λ _ 
         ret (VString "account-closed"))
      else
        -- No balance, just destroy
        (trigger (destroyObj accountToClose) >>= λ _ 
         ret (VString "account-closed-empty"))
    closeWithBalance (VString _) = ret (VString "invalid-balance")
    closeWithBalance (VBool _) = ret (VString "invalid-balance")
    closeWithBalance (VPair _ _) = ret (VString "invalid-balance")
    closeWithBalance VNil = ret (VString "invalid-balance")
    closeWithBalance (VList _) = ret (VString "invalid-balance")

This ensures no funds are lost when an account is closed - the remaining balance is transferred before destruction using destroyObj.

What's next?

We just defined the instruction set syntax. This is the first iteration. However, we need to define an interpreter for the instruction set. Otherwise, how exactly does the execution of a program actually work?

interpreter : State -> Instr₀ Safe -> (State × Output × Trace)

Also, from here, it should be relatively easy to prove:

  • atomicity (all-or-nothing per tx)
  • determinism

References to other modules

This page references the following modules:


Symbol disambiguation log

The following references had multiple matches and were disambiguated:

  • Tx: chose AVM.Instruction.TxLog from ['AVM.Instruction.TxLog', 'AVM.Instruction.TxId', 'AVM.Instruction.beginTx']

Module References

Referenced By

This module is referenced by:

References

This module references:

?>