Skip to content

AVM Instruction Set

{-# OPTIONS --without-K --type-in-type --guardedness --exact-split #-}
open import Foundation.BasicTypes

Work in Progress

This document is a work in progress. It is not yet complete and may contain inaccuracies. Everything below is a proposal for defining the final instruction set of the Anoma Virtual Machine which will implicitly define AVM programs.

Module Parameters

The module is parameterized by Val which serves as both Input and Output types in the current implementation. The I-O object model (φ : I* ⇀ O) permits distinct types, but this implementation uses Val for both to simplify the architecture. Future versions may introduce separate Input and Output type parameters.

module AVM.Instruction
    -- Core types
    (Val : Set)                      -- Used for both Input and Output currently
    (ObjectId : Set)
    (freshObjectId :   ObjectId)

    -- Machine/distribution types
    (MachineId : Set)

    -- Controller/distribution types
    (ControllerId : Set)
    (ControllerVersion : Set)

    -- Transaction types
    (TxId : Set)
    (freshTxId :   TxId)

    -- Object model type
    (Object : Set)
  where

-- Import and re-export Runtime types for convenience
-- Runtime defines: Input = Output = Val
open import AVM.Runtime Val ObjectId freshObjectId MachineId ControllerId
                        ControllerVersion TxId freshTxId Object public

Motivation and Context

What is the AVM?

AVM stands for Anoma Virtual Machine. It is a transactional virtual machine for message-driven objects that supports pure functions, distributed computation, and optional nondeterministic choice for intent matching.

We will unfold all these concepts here.

What is a message in the AVM?

Message passing is the only way to communicate between objects. The representation of messages is left abstract by design. That is the type Val in this module. One can instantiate this module with different types of messages, such as S-expressions, JSON, or any other data format.

We define semantic aliases for messages to clarify the message-passing nature of object interaction. These types are defined in the Runtime module.

What is an object in the AVM?

The AVM instruction set is parameterized by an abstract object model. This module does not specify object internals; it treats objects as opaque entities identified by ObjectId. The Object parameter abstracts the object's internal representation.

Different object models may instantiate this parameter. One implementation uses sequential objects where each object's behavior is a pure function mapping input sequences to output sequences (see the Objects module). Objects may be deterministic or incorporate nondeterministic choice via the choose instruction. Alternative models could use different concurrency semantics, state representations, or behavioral specifications.

The instruction set operates uniformly over any object model satisfying the module parameters. Instructions create, destroy, and communicate with objects via ObjectId without depending on object internals.

Parametricity: This design preserves parametricity in the type-theoretic sense. The instruction semantics cannot inspect or pattern-match on the Object type because it is abstract. This ensures that programs written against the instruction set work correctly for any concrete object model instantiation. Returning raw Object values from instructions like createObj would break parametricity by exposing implementation details that should remain encapsulated within the interpreter.

Extrinsic metadata: Runtime information managed by the AVM, including the object's identifier, accumulated input history, and controller assignments. Controllers are distributed nodes that order and execute transactions; each object tracks which controller created it and which currently controls it, along with version numbers for distributed consistency. These runtime types are defined in the Runtime module.

The AVM as a state machine

The AVM executes programs that interact with objects through a well-defined instruction set. We model this instruction set as an algebraic datatype whose constructors represent the admissible operations—creating objects, invoking their behavior, transferring control, and managing transactions.

Programs have the type AVMProgram, whose semantics we define using interaction trees. This coinductive structure cleanly separates effectful operations (instructions) from pure computation. We derive an operational semantics as a small-step transition system over a global state containing all live objects and their metadata, enabling formal verification of properties like atomicity, serializability, and location transparency.

Runtime Execution Environment

The execution environment defines the runtime context in which AVM programs execute. This includes the object store (Store), transaction overlays (TxWrite), controller identity, pure function registry (PureFunctions), object metadata (ObjectMeta), and execution state (State). All runtime types are defined in the Runtime module and re-exported here for convenience.

Baseline Instruction Set Architecture (ISA)

The AVM instruction set architecture (ISA) defines the primitive operations available to programs. We have separated the instructions into different instruction sets, each with a different safety level and purpose/capabilities.

The instruction sets are organized into the following layers:

  • Base layer: Object lifecycle and communication
  • Introspection layer: Context queries and reflection
  • Transaction layer: Atomic execution contexts
  • Pure function layer: Pure computation via function registry
  • Distribution layer: Controller operations and cross-controller communication
  • Nondeterminism layer: Choice and constraint instructions for intent matching

These layers are independent and can be composed together to form custom instruction sets.

Here we model the instruction sets as an inductive datatype where each data constructor is one instruction in the corresponding instruction set.

Each instruction set datatype is indexed by the safety level of the instructions in the set. The safety level is a type-level function that signatures of the instruction set. The safety level is used to enforce that safe programs cannot call unsafe operations at compile time.

The type-level function signature of the instruction set is the following:

ISA : Set
ISA = Set -> Set

Safety datatype

Two safety levels are defined: Safe and Unsafe. Instructions below are marked as unsafe when they violate core principles of the object model or pose systemic risks.

data Safety : Set where
  Safe   : Safety
  Unsafe : Safety

Object Instructions

This is the first layer of the AVM ISA. It provides the fundamental object-oriented operations for creating and destroying objects, and communicating with them via message-passing.

ObjInstruction datatype

-- Object lifecycle and communication
data ObjInstruction : Safety  ISA where
  -- Object lifecycle
  createObj : Val  ObjInstruction Safe ObjectId
  destroyObj : ObjectId  ObjInstruction Safe Bool  -- May fail if object doesn't exist

  -- Message passing (may fail if object doesn't exist or rejects input)
  call : ObjectId  Input  ObjInstruction Safe (Maybe Output)

These instructions provide the fundamental object-oriented programming model. Objects are created with createObj, destroyed with destroyObj, and communicate via synchronous message-passing through call.

Introspection Instructions

This is the second layer of the AVM ISA. It provides the fundamental introspection operations for querying the execution environment and object metadata.

IntrospectInstruction datatype

Introspection instructions are used to query the execution environment and object metadata.

data IntrospectInstruction : Safety  ISA where

  -- Safe introspection
  self : IntrospectInstruction Safe ObjectId
  input : IntrospectInstruction Safe Input
  getCurrentMachine : IntrospectInstruction Safe MachineId

  -- Unsafe operations
  reflect : ObjectId  IntrospectInstruction Unsafe (Maybe ObjectMeta)

  -- Lookup operations
  scryMeta
    : (ObjectMeta  Bool) 
      IntrospectInstruction Unsafe (List (ObjectId × ObjectMeta))

  scryDeep
    : (Object  ObjectMeta  Bool) 
      IntrospectInstruction Unsafe (List (ObjectId × Object × ObjectMeta))

Here we see our first unsafe operations. These instructions break encapsulation and risk unbounded computation by triggering expensive global operations that scale with the total number of objects in the system.

The scryMeta instruction queries the store for objects whose metadata satisfies a given predicate. It returns pairs of object identifiers and metadata for all matching objects. The scryDeep instruction extends this by filtering on both object internals and metadata, returning the complete object data. Both instructions traverse the entire store. The two-tier design separates common metadata queries from rare deep inspection. Metadata queries preserve parametricity by not exposing the abstract Object type. Deep queries break this abstraction but avoid the performance inconsistency of filtering objects while returning only identifiers, which would force subsequent reflection calls.

The reflect instruction retrieves metadata for a specific object without invoking its message-passing interface. This violates the principle that objects control what information they expose. In a large system with millions of objects, scrying with a predicate that always returns true could force enumeration of the entire store, causing significant performance degradation.

Instr₀ instruction set, the minimal instruction set

We combine the first two layers into our first minimal instruction set, Instr₀. This is the minimal instruction set in the AVM ISA. It contains only the base object and introspection instructions.

data Instr₀ : Safety  ISA where
  Obj :  {s A}  ObjInstruction s A  Instr₀ s A
  Introspect :  {s A}  IntrospectInstruction s A  Instr₀ s A

With an instruction set in place, we can now define what are programs using only base operations. A program is an interaction tree of these instructions.

open import Foundation.InteractionTrees
AVMProgram₀  : Set  Set
AVMProgram₀ = ITree (Instr₀ Safe)

TxInstruction datatype

This is the third layer of the AVM ISA. It provides the fundamental transactional operations for managing atomic execution contexts.

data TxInstruction : Safety  ISA where
  beginTx : TxInstruction Safe TxId
  commitTx : TxId  TxInstruction Safe Bool      -- May fail if conflicts
  abortTx : TxId  TxInstruction Safe 

Instr₁ instruction set, adds transactional support

Programs that can roll back changes are now possible via TxTransaction instructions. With this feature, we can now define our second instruction set, Instr₁.

data Instr₁ : Safety  ISA where
  Obj :  {s A}  ObjInstruction s A  Instr₁ s A
  Introspect :  {s A}  IntrospectInstruction s A  Instr₁ s A
  Tx :  {s A}  TxInstruction s A  Instr₁ s A

Transactional programs:

AVMProgram₁  : Set  Set
AVMProgram₁ = ITree (Instr₁ Safe)

PureInstruction datatype

This is the fourth layer of the AVM ISA. This step adds means to call pure functions, and register new pure functions. So think of this as a capability to extend the any instruction set with deterministic computation.

data PureInstruction : Safety  ISA where
  -- Call a registered pure function by identifier
  callPure : String  List Val  PureInstruction Safe (Maybe Val)

  -- Register new pure function (unsafe - extends the function set)
  registerPure : String  (List Val  Maybe Val)  PureInstruction Unsafe Bool

Instr₂ instruction set, adds pure function computation

With the ability to call pure functions, we can now define our third instruction set, Instr₂. This instruction set adds pure function computation to the transactional instruction set.

data Instr₂ : Safety  ISA where
  Obj :  {s A}  ObjInstruction s A  Instr₂ s A
  Introspect :  {s A}  IntrospectInstruction s A  Instr₂ s A
  Tx :  {s A}  TxInstruction s A  Instr₂ s A
  Pure :  {s A}  PureInstruction s A  Instr₂ s A
AVMProgram₂  : Set  Set
AVMProgram₂ = ITree (Instr₂ Safe)

Distribution Layer: Machine and Controller Instructions

AVM programs execute in a distributed environment with two orthogonal concepts: physical machines (where computation and storage occur) and logical controllers (who order transactions and own objects). This separation enables independent reasoning about data locality and authority.

We split distribution into two instruction families to maintain clear separation of concerns, enable independent testing, and support distinct policy enforcement for physical versus logical operations.

MachineInstruction datatype

Machines are physical nodes that host computation and object data. Programs can query which machine holds an object's data and move execution or data between machines. Machine operations deal with physical resource location and process migration.

data MachineInstruction : Safety  ISA where
  -- Query physical machine location of object data
  getMachine : ObjectId  MachineInstruction Safe (Maybe MachineId)

  -- Move execution context (process) to another machine
  teleport : MachineId  MachineInstruction Safe Bool

  -- Move object data to another machine (changes physical location)
  moveObject : ObjectId  MachineId  MachineInstruction Safe Bool

Safety constraints: teleport is invalid during active transactions. Attempting to teleport while a transaction is in progress should result in an error.

ControllerInstruction datatype

Controllers are logical authorities that order transactions and own objects. Each object records which controller created it and which controller currently owns it, along with version numbers for distributed consistency. Programs can query controller ownership and transfer objects between controllers without moving their physical location.

data ControllerInstruction : Safety  ISA where
  -- Query controller identity and ownership
  getCurrentController : ControllerInstruction Safe ControllerId
  getController : ObjectId  ControllerInstruction Safe (Maybe ControllerId)

  -- Transfer object ownership to another controller (changes logical authority)
  transferObject : ObjectId  ControllerId  ControllerInstruction Safe Bool

  -- Controller version management for distributed consistency
  getCurrentVersion : ControllerInstruction Safe ControllerVersion
  switchVersion : ControllerVersion  ControllerInstruction Safe Bool

Authority requirements: transferObject requires proper authorization. The current controller must have authority to transfer the object. Future extensions will include endorsement protocols for cross-controller transfers.

The Instruction datatype

For usability, we provide an Instruction datatype that combines all instruction families with ergonomic pattern synonyms. This allows writing programs with a flat instruction namespace while maintaining compositional structure.

-- Union datatype combining all instruction families
data Instruction : Safety  ISA where
  Obj :  {s A}  ObjInstruction s A  Instruction s A
  Introspect :  {s A}  IntrospectInstruction s A  Instruction s A
  Tx :  {s A}  TxInstruction s A  Instruction s A
  Pure :  {s A}  PureInstruction s A  Instruction s A
  Machine :  {s A}  MachineInstruction s A  Instruction s A
  Controller :  {s A}  ControllerInstruction s A  Instruction s A

Pattern synonyms provide a flat namespace for common instructions. These pattern can also be used to pattern match on the instruction set.

Pattern synonym definitions (click to expand)
-- Object instruction patterns
pattern obj-create val = Obj (createObj val)
pattern obj-destroy oid = Obj (destroyObj oid)
pattern obj-call oid inp = Obj (call oid inp)

-- Introspection instruction patterns
pattern get-self = Introspect self
pattern get-input = Introspect input
pattern get-current-machine = Introspect getCurrentMachine
pattern obj-scry-meta pred = Introspect (scryMeta pred)
pattern obj-scry-deep pred = Introspect (scryDeep pred)
pattern obj-reflect oid = Introspect (reflect oid)

-- Transaction instruction patterns
pattern tx-begin = Tx beginTx
pattern tx-commit tid = Tx (commitTx tid)
pattern tx-abort tid = Tx (abortTx tid)

-- Pure function instruction patterns
pattern call-pure name args = Pure (callPure name args)
pattern register-pure name fn = Pure (registerPure name fn)

-- Machine instruction patterns (physical location and process migration)
pattern get-machine oid = Machine (getMachine oid)
pattern do-teleport mid = Machine (teleport mid)
pattern move-object oid mid = Machine (moveObject oid mid)

-- Controller instruction patterns (logical authority and ownership)
pattern get-current-controller = Controller getCurrentController
pattern get-controller oid = Controller (getController oid)
pattern transfer-object oid cid = Controller (transferObject oid cid)
pattern get-current-version = Controller getCurrentVersion
pattern switch-version ver = Controller (switchVersion ver)

The Instruction type provides the full AVM instruction set in a single datatype:

AVMProgram : Set  Set
AVMProgram = ITree (Instruction Safe)

This design combines compositionality (instruction families as separate types) with ergonomics (pattern synonyms for flat naming). Programs can use either the layered approach (Instr₀, Instr₁, etc.) for compositional reasoning or the unified Instruction type for convenience.

Instruction Set Semantics

We provided multiple instruction sets, organized by capability and safety level. This section provides comprehensive semantics for all instruction types. Each instruction includes its safety level and return type.

Let's recall how the instruction sets are organized by instruction family:

  1. ObjInstruction: Object instructions: Object lifecycle and communication
  2. IntrospectInstruction: Introspection instructions: Context queries and reflection
  3. TxInstruction: Transaction instructions: Atomic execution contexts
  4. PureInstruction: Pure function instructions: Deterministic computation via function registry
  5. MachineInstruction: Machine instructions: Physical location and process migration
  6. ControllerInstruction: Controller instructions: Logical authority and ownership transfer

Object Lifecycle

Object lifecycle instructions manage the creation and destruction of objects in the store.

createObj

createObj : Val → ObjInstruction Safe ObjectId

Creates a new object in the store from an Val specification. The Val may be used to specify the object's kind or initial program code. Returns a fresh ObjectId that uniquely identifies the new object. Object creation is transactional: if the enclosing transaction aborts, the object is not created.

destroyObj

destroyObj : ObjectId → ObjInstruction Safe Bool

Marks the object identified by the given ObjectId for destruction. Returns true if destruction succeeds, false if the object does not exist. The object is removed from the store, and subsequent references to this ObjectId will fail. Destruction is transactional: the object remains accessible within the current transaction until committed.

Object Interaction

Object interaction is achieved through pure message-passing, preserving object encapsulation. Message passing is the only way to interact with objects in the AVM.

call

call : ObjectId → Input → ObjInstruction Safe (Maybe Output)

Performs synchronous message passing to the object identified by the given ObjectId. Sends the input and blocks until the target object produces an output. Returns nothing if the object does not exist or rejects the input, otherwise returns just the output produced by the target object.

Introspection and Context

Introspection instructions query the execution environment and object metadata.

self

self : IntrospectInstruction Safe ObjectId

Returns the ObjectId of the currently executing object. This instruction is essential for recursion and self-reference, allowing an object to pass its own identifier to other objects or invoke itself. See also the use of self in defining purely functional resources in the AVM.

input

input : IntrospectInstruction Safe Input

Returns the input being processed by the current object. This instruction provides access to the message sent to the current object.

getCurrentMachine

getCurrentMachine : IntrospectInstruction Safe MachineId

Returns the identifier of the physical machine currently executing this program. This instruction enables programs to reason about their execution location, which is independent of controller identity. Machine information is useful for data locality optimizations and understanding cross-machine communication costs.

scryMeta (Unsafe)

scryMeta : (ObjectMeta → Bool) → IntrospectInstruction Unsafe (List (ObjectId × ObjectMeta))

Queries the store for objects whose metadata satisfies the given predicate. Returns pairs of object identifiers and metadata for all matching objects. This instruction traverses the entire store, applying the predicate to each object's metadata. Typical queries include finding all objects created by a specific controller, objects modified after a particular version, or objects whose history exceeds a certain length. The instruction preserves parametricity by not exposing the abstract Object type. Marked unsafe because enumeration scales with total object count.

scryDeep (Unsafe)

scryDeep : (Object → ObjectMeta → Bool) → IntrospectInstruction Unsafe (List (ObjectId × Object × ObjectMeta))

Queries the store for objects whose internals and metadata satisfy the given predicate. Returns complete object data for all matches. This instruction breaks encapsulation by exposing raw Object values, violating parametricity. It eliminates the performance inconsistency of filtering on object internals while returning only identifiers. Use cases include deep inspection for debugging, auditing, or migration. Marked unsafe for both encapsulation violation and unbounded computation.

reflect (Unsafe)

reflect : ObjectId → IntrospectInstruction Unsafe (Maybe ObjectMeta)

Retrieves metadata about the object identified by the given ObjectId. Returns nothing if the object does not exist, otherwise returns just the object's metadata. Marked as unsafe because it bypasses object encapsulation.

Transaction Control

Transaction control instructions manage atomic execution contexts. All state modifications within a transaction are tentative until committed.

beginTx

beginTx : TxInstruction Safe TxId

Initiates a new transactional context and returns a fresh transaction identifier. All subsequent state modifications are logged to the transaction's write-set until the transaction is either committed or aborted. Transactions provide atomicity: either all changes succeed or none do.

commitTx

commitTx : TxId → TxInstruction Safe Bool

Attempts to commit the transaction identified by the given TxId. Returns true if the commit succeeds, persisting all logged changes to the store. Returns false if the transaction cannot be committed due to conflicts with concurrent transactions or if the transaction was already finalized.

abortTx

abortTx : TxId → TxInstruction Safe ⊤

Aborts the transaction identified by the given TxId, discarding all tentative state changes in its write-set. The store reverts to its state before beginTx was called. This operation always succeeds and returns unit.

Pure Function Instructions

Pure function instructions provide deterministic computation capabilities through an extensible function registry.

callPure

callPure : String → List Val → PureInstruction Safe (Maybe Val)

Invokes a registered pure function by identifier with the given arguments. Returns the function result or nothing if the function doesn't exist or the arguments don't match the expected arity.

registerPure (Unsafe)

registerPure : String → (List Val → Maybe Val) → PureInstruction Unsafe Bool

Registers a new pure function in the function registry. Marked as unsafe because it extends the global function set, potentially affecting system-wide computation.

Machine Instructions

Machine instructions manage physical resource location and process migration in distributed AVM deployments. These operations deal with where computation executes and where object data physically resides.

getMachine

getMachine : ObjectId → MachineInstruction Safe (Maybe MachineId)

Returns the physical machine where the specified object's data resides, or nothing if the object doesn't exist. Machine location is independent of controller ownership. This information is useful for data locality optimization and understanding cross-machine communication costs.

teleport

teleport : MachineId → MachineInstruction Safe Bool

Moves the execution context (process) to the specified physical machine. Returns true if teleportation succeeds, false if the target machine is unreachable. This changes where computation happens but does not change controller identity or object ownership.

Constraint: This instruction is invalid during active transactions. The interpreter must reject teleportation attempts within a transaction boundary to maintain transaction integrity.

moveObject

moveObject : ObjectId → MachineId → MachineInstruction Safe Bool

Moves an object's data to a different physical machine. Returns true if the move succeeds, false if the target machine is unreachable. This changes the object's physical storage location but does not change its controller ownership. Machine migration enables data locality optimization and load balancing.

Controller Instructions

Controller instructions manage logical authority and ownership for distributed AVM deployments. Controllers order transactions and own objects. These operations deal with which controller has authority over objects and transaction ordering.

getCurrentController

getCurrentController : ControllerInstruction Safe ControllerId

Returns the identifier of the controller (logical authority) currently executing this program. Controllers order transactions and own objects. This is independent of the physical machine executing the code.

getController

getController : ObjectId → ControllerInstruction Safe (Maybe ControllerId)

Returns the controller responsible for the specified object, or nothing if the object doesn't exist. This queries the object's logical ownership, not its physical location. The controller determines transaction ordering for the object.

transferObject

transferObject : ObjectId → ControllerId → ControllerInstruction Safe Bool

Transfers logical ownership of an object to another controller. This changes which controller orders transactions for the object but does not move the object's data between machines. Returns true if the transfer succeeds, false if the transfer is unauthorized or the target controller is unreachable.

Authority requirement: The current controller must have authority to transfer the object. Future extensions will implement a two-phase transfer protocol with endorsement to maintain cross-controller integrity.

getCurrentVersion

getCurrentVersion : ControllerInstruction Safe ControllerVersion

Returns the current version/state of the executing controller. Version information is used for distributed consistency and to ensure controllers operate on compatible state.

switchVersion

switchVersion : ControllerVersion → ControllerInstruction Safe Bool

Attempts to switch to a specific controller version. Returns true if the version switch succeeds, false if the version is incompatible or unavailable. This enables controller upgrades and version-specific behavior.

Appendix: Instruction Set Extensions (Sketch)

To align with Anoma's intents and multi-party transactions, we sketch two additional instruction families. These are presented as a design stub; the operational semantics and interpreter integration are future work.

WeightedVal datatype

-- Weighted choices over values (preference-directed nondeterminism)
record WeightedVal : Set where
  field value : Val
        weight : 

NondetInstruction datatype

data NondetInstruction : Safety  ISA where
  -- Choose a value according to a (preference distribution)?
  choose : List Val  NondetInstruction Safe Val

  -- Constrain the current tentative transaction; if false at commit, abort
  require : Bool  NondetInstruction Safe 

ConstraintId datatype

-- Linear constraints within a transaction (multi-party coordination)
record ConstraintId : Set where 
  constructor mkCid
  field
    constraintId : 

LinearConstraint datatype

A linear constraint requires that a message send be used exactly once.

data LinearConstraint : Set where
  UseOnce : ObjectId  Input  LinearConstraint

ConstraintInstruction datatype

data ConstraintInstruction : Safety  ISA where
  -- Register a new linear constraint; returns its identifier
  newConstraint : LinearConstraint  ConstraintInstruction Safe ConstraintId

  -- Mark a constraint as satisfied by the current step
  satisfy : ConstraintId  ConstraintInstruction Safe Bool

Instr₄ instruction set: intent-aware ISA

Version 4 combines all instruction families including nondeterminism and constraints:

data Instr₄ : Safety  ISA where
  Obj :  {s A}  ObjInstruction s A  Instr₄ s A
  Introspect :  {s A}  IntrospectInstruction s A  Instr₄ s A
  Tx :  {s A}  TxInstruction s A  Instr₄ s A
  Pure :  {s A}  PureInstruction s A  Instr₄ s A
  Machine :  {s A}  MachineInstruction s A  Instr₄ s A
  Controller :  {s A}  ControllerInstruction s A  Instr₄ s A
  Nondet :  {s A}  NondetInstruction s A  Instr₄ s A
  Constr :  {s A}  ConstraintInstruction s A  Instr₄ s A
AVMProgram₄ : Set  Set
AVMProgram₄ = ITree (Instr₄ Safe)

References to other modules

This page references the following modules:


Module References

Referenced By

This module is referenced by:

References

This module references:

?>