yellow-paper/docs/public-vm/avm.md
+166-20
@@ -12,17 +12,19 @@ Many terms and definitions here are borrowed from the [Ethereum Yellow Paper](ht
12
12
:::
13
13
14
14
## Introduction
15
-
An Aztec transaction may include one or more public execution requests. A public execution request represents an initial "message call" to a contract, providing input data and triggering the execution of that contract's public code in the Aztec Virtual Machine. Given a message call to a contract, the AVM executes the corresponding code one instruction at a time, treating each instruction as a transition function on its state.
15
+
An Aztec transaction may include one or more **public execution requests**. A public execution request represents an initial **message call** to a contract, providing input data and triggering the execution of that contract's public code in the Aztec Virtual Machine. Given a message call to a contract, the AVM executes the corresponding code one instruction at a time, treating each instruction as a transition function on its state.
16
16
17
17
> Public execution requests may originate as [`enqueuedPublicFunctionCalls`](../calls/enqueued-calls.md) triggered during the transaction's private execution.
18
18
19
19
This document contains the following sections:
20
-
- **Public contract bytecode** (aka AVM bytecode)
21
-
- **Execution Context**, outlining the AVM's environment and state
22
-
- **Execution**, outlining control flow, gas tracking, halting, and reverting
23
-
- **Nested calls**, outlining the initiation of message calls, processing of sub-context results, gas refunds, and world state reverts
- [**Execution context**](#execution-context), outlining the AVM's environment and state
22
+
- [**Execution**](#execution), outlining control flow, gas tracking, halting, and reverting
23
+
- [**Nested calls**](#nested-calls), outlining the initiation of message calls, processing of sub-context results, gas refunds, and world state reverts
24
24
25
-
The **["AVM Instruction Set"](./InstructionSet)** document supplements this one with the list of all supported instructions and their associated state transition functions.
25
+
Refer to the **["AVM Instruction Set"](./InstructionSet)** for the list of all supported instructions and their associated state transition functions.
26
+
27
+
For details on the AVM's "tagged" memory model, refer to the **["AVM Memory Model"](./state-model.md)**.
26
28
27
29
> Note: The Aztec Virtual Machine, while designed with a SNARK implementation in mind, is not strictly tied to any particular implementation and therefore is defined without SNARK or circuit-centric verbiage. That being said, considerations for a SNARK implementation are raised or linked when particularly relevant or helpful.
28
30
@@ -33,12 +35,14 @@ A contract's public bytecode is a series of execution instructions for the AVM.
33
35
34
36
> Note: See the [Bytecode Validation Circuit](./bytecode-validation-circuit.md) to see how a contract's bytecode can be validated and committed to.
35
37
38
+
Refer to ["Bytecode"](/docs/bytecode) for more information.
39
+
36
40
## Execution Context
37
41
:::note REMINDER
38
42
Many terms and definitions here are borrowed from the [Ethereum Yellow Paper](https://ethereum.github.io/yellowpaper/paper.pdf).
39
43
:::
40
44
41
-
An "Execution Context" includes the information necessary to initiate AVM execution along with the state maintained by the AVM throughout execution:
45
+
An **execution context** includes the information necessary to initiate AVM execution along with the state maintained by the AVM throughout execution:
42
46
```
43
47
AVMContext {
44
48
environment: ExecutionEnvironment,
@@ -49,11 +53,11 @@ AVMContext {
49
53
}
50
54
```
51
55
52
-
The first two entries, "Execution Environment" and "Machine State", share the same lifecycle. They contain information pertaining to a single message call and are initialized prior to the start of a call's execution.
56
+
The first two entries, **execution environment** and **machine state**, share the same lifecycle. They contain information pertaining to a single message call and are initialized prior to the start of a call's execution.
53
57
54
58
> When a nested message call is made, a new environment and machine state are initialized by the caller. In other words, a nested message call has its own environment and machine state which are _partially_ derived from the caller's context.
55
59
56
-
The "Execution Environment" is fully specified by a message call's execution agent and remains constant throughout a call's execution.
60
+
The **execution environment** is fully specified by a message call's execution agent and remains constant throughout a call's execution.
57
61
```
58
62
ExecutionEnvironment {
59
63
address,
@@ -73,7 +77,7 @@ ExecutionEnvironment {
73
77
}
74
78
```
75
79
76
-
"Machine State" is partially specified by the execution agent, and otherwise begins as empty or uninitialized for each message call. This state is transformed on an instruction-per-instruction basis.
80
+
**Machine state** is partially specified by the execution agent, and otherwise begins as empty or uninitialized for each message call. This state is transformed on an instruction-per-instruction basis.
77
81
```
78
82
MachineState {
79
83
l1GasLeft,
@@ -83,30 +87,30 @@ MachineState {
83
87
}
84
88
```
85
89
86
-
"World State" contains persistable VM state. If a message call succeeds, its world state updates are applied to the calling context (whether that be a parent call's context or the transaction context). If a message call fails, its world state updates are rejected by its caller. When a _transaction_ succeeds, its world state updates persist into future transactions.
90
+
**World state** contains persistable VM state. If a message call succeeds, its world state updates are applied to the calling context (whether that be a parent call's context or the transaction context). If a message call fails, its world state updates are rejected by its caller. When a _transaction_ succeeds, its world state updates persist into future transactions.
noteHashes: (address, index) => noteHash, // read & append only
95
+
nullifiers: (address, index) => nullifier, // read & append only
96
+
l1l2messageHashes: (address, key) => messageHash, // read only
97
+
contracts: (address) => {bytecode, portalAddress}, // read only
94
98
}
95
99
```
96
100
97
101
> Note: the notation `key => value` describes a mapping from `key` to `value`.
98
102
99
103
> Note: each member of the world state is implemented as an independent Merkle tree with different properties.
100
104
101
-
The "Accrued Substate", as coined in the [Ethereum Yellow Paper](https://ethereum.github.io/yellowpaper/paper), contains information that is accrued throughout transaction execution to be "acted upon immediately following the transaction." These are append-only arrays containing state that is not relevant to other calls or transactions. Similar to world state, if a message call succeeds, its substate is appended to its calling context, but if it fails its substate is dropped by its caller.
105
+
The **accrued substate**, as coined in the [Ethereum Yellow Paper](https://ethereum.github.io/yellowpaper/paper), contains information that is accrued throughout transaction execution to be "acted upon immediately following the transaction." These are append-only arrays containing state that is not relevant to other calls or transactions. Similar to world state, if a message call succeeds, its substate is appended to its calling context, but if it fails its substate is dropped by its caller.
102
106
```
103
107
AccruedSubstate {
104
108
logs: [], // append-only
105
109
l2toL1Messages: [], // append-only
106
110
}
107
111
```
108
112
109
-
Finally, when a message call halts, it sets the context's "Message Call Results" to communicate results to the caller.
113
+
Finally, when a message call halts, it sets the context's **message call results** to communicate results to the caller.
110
114
```
111
115
MessageCallResults {
112
116
reverted: boolean,
@@ -115,8 +119,7 @@ MessageCallResults {
115
119
```
116
120
117
121
### Context initialization for initial call
118
-
This section outlines AVM context initialization specifically for a **public execution request's initial message call** (_i.e._ not a nested message call). Context initialization for nested message calls will be explained in a later section.
119
-
122
+
This section outlines AVM context initialization specifically for a **public execution request's initial message call** (_i.e._ not a nested message call). Context initialization for nested message calls will be explained [in a later section](#context-initialization-for-a-nested-call).
120
123
When AVM execution is initiated for a public execution request, the AVM context is initialized as follows:
> Note: unlike memory in the Ethereum Virtual Machine, uninitialized memory in the AVM is not readable! A memory cell must be written (and therefore [type-tagged](./state-model#types-and-tagged-memory)) before it can be read.
168
+
169
+
## Execution
170
+
With an initialized context (and therefore an initial program counter of 0), the AVM can execute a message call starting with the very first instruction in its bytecode.
171
+
172
+
### Program Counter and Control Flow
173
+
The program counter (machine state's `pc`) determines which instruction to execute (`instr = environment.bytecode[pc]`). Each instruction's state transition function updates the program counter in some way, which allows the VM to progress to the next instruction at each step.
174
+
175
+
Most instructions simply increment the program counter by 1. This allows VM execution to flow naturally from instruction to instruction. Some instructions ([`JUMP`](./InstructionSet#isa-section-jump), [`JUMPI`](./InstructionSet#isa-section-jumpi), `INTERNALCALL`, `INTERNALRETURN`) modify the program counter based on inputs.
176
+
177
+
`JUMP`, `JUMPI`, and `INTERNALCALL` assign a new value to the program counter from a constant present in the bytecode. These instructions never assign a value from memory to the program counter. Before jumping, the `INTERNALCALL` instruction pushes the current program counter to an internal call-stack that is maintained in a reserved region of memory. `INTERNALRETURN` pops a destination from that internal call-stack and jumps there. Thus, jump destinations can be either constants from the contract bytecode or destinations popped from the internal call-stack.
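The control-flow rules above can be illustrated with a minimal Python sketch. It is not the AVM's implementation. The argument names `loc` and `condOffset`, the `Instr` and `MachineState` classes, and the use of a plain Python list for the internal call-stack (rather than a reserved memory region) are all assumptions made for illustration. The sketch also assumes that `INTERNALCALL` records `pc + 1` as the return destination so that execution resumes after the call; the text above only states that the current program counter is pushed.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    opcode: str
    args: dict = field(default_factory=dict)  # hypothetical argument container


@dataclass
class MachineState:
    pc: int = 0
    memory: dict = field(default_factory=dict)
    # Assumption: modeled as a list; the spec keeps it in reserved memory.
    internal_call_stack: list = field(default_factory=list)


def update_pc(state: MachineState, instr: Instr) -> None:
    """Apply only the program-counter part of an instruction's transition."""
    op = instr.opcode
    if op == "JUMP":
        # Destination is a constant taken from the bytecode.
        state.pc = instr.args["loc"]
    elif op == "JUMPI":
        # The condition is read from memory; the destination is still a constant.
        cond = state.memory[instr.args["condOffset"]]
        state.pc = instr.args["loc"] if cond != 0 else state.pc + 1
    elif op == "INTERNALCALL":
        # Assumption: store pc + 1 so INTERNALRETURN resumes after the call.
        state.internal_call_stack.append(state.pc + 1)
        state.pc = instr.args["loc"]
    elif op == "INTERNALRETURN":
        state.pc = state.internal_call_stack.pop()
    else:
        # Most instructions simply advance to the next instruction.
        state.pc += 1
```

Note that no branch reads a jump destination from general memory: destinations come either from `instr.args` (bytecode constants) or from the internal call-stack.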
178
+
179
+
### Gas limits and tracking
180
+
Each instruction has an associated `l1GasCost` and `l2GasCost`. Before an instruction is executed, the VM enforces that there is sufficient gas remaining via the following assertions:
> Note: many instructions (like arithmetic operations) have 0 `l1GasCost`. Instructions only incur an L1 cost if they modify world state or accrued substate.
186
+
187
+
If these assertions pass, the machine state's gas left is decreased prior to the instruction's core execution:
188
+
```
189
+
machineState.l1GasLeft -= instr.l1GasCost
190
+
machineState.l2GasLeft -= instr.l2GasCost
191
+
```
192
+
193
+
If either of these assertions _fails_ for an instruction, this triggers an exceptional halt. The gas left is set to 0 and execution reverts.
194
+
```
195
+
machineState.l1GasLeft = 0
196
+
machineState.l2GasLeft = 0
197
+
```
198
+
> Reverting and exceptional halts will be covered in more detail [in a later section](#halting).
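As a rough illustration of the gas rules in this section, the following Python sketch combines the sufficiency check, the deduction, and the out-of-gas case into one helper. The `GasState` class and `charge_gas` function are hypothetical names, and the sketch assumes "sufficient gas" means that the gas left is at least the instruction's cost for both L1 and L2 gas.

```python
from dataclasses import dataclass


@dataclass
class GasState:
    l1_gas_left: int
    l2_gas_left: int


def charge_gas(state: GasState, l1_gas_cost: int, l2_gas_cost: int) -> bool:
    """Charge an instruction's gas before its core execution.

    Returns True if the instruction may execute. Returns False if gas is
    insufficient, in which case both gas counters are zeroed (the
    exceptional-halt behavior described above).
    """
    if state.l1_gas_left < l1_gas_cost or state.l2_gas_left < l2_gas_cost:
        state.l1_gas_left = 0
        state.l2_gas_left = 0
        return False
    state.l1_gas_left -= l1_gas_cost
    state.l2_gas_left -= l2_gas_cost
    return True
```

A caller would treat a `False` return as an exceptional halt and stop executing the current context.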
199
+
200
+
### Gas cost notes and examples
201
+
An instruction's gas cost is loosely derived from its complexity. Execution complexity of some instructions changes based on inputs. Here are some examples and important notes:
202
+
- [`JUMP`](./InstructionSet/#isa-section-jump) is an example of an instruction with constant gas cost. Regardless of its inputs, the instruction always incurs the same `l1GasCost` and `l2GasCost`.
203
+
- The [`SET`](./InstructionSet/#isa-section-set) instruction operates on a differently sized constant depending on its `dst-type`. Therefore, this instruction's gas cost increases with the size of its input.
204
+
- Instructions that operate on a data range of a specified "size" scale in cost with that size. An example of this is the [`CALLDATACOPY`](./InstructionSet/#isa-section-calldatacopy) instruction, which copies `copySize` words from `environment.calldata` to memory.
205
+
- The [`CALL`](./InstructionSet/#isa-section-call)/[`STATICCALL`](./InstructionSet/#isa-section-staticcall)/`DELEGATECALL` instruction's gas cost is determined by its `l*Gas` arguments, but any gas unused by the triggered message call is refunded after its completion (more on this later).
206
+
- An instruction with "offset" arguments (like [`ADD`](./InstructionSet/#isa-section-add) and many others) has increased cost for each offset argument that is flagged as "indirect".
207
+
208
+
> Implementation detail: an instruction's gas cost will roughly align with the number of rows it corresponds to in the SNARK execution trace including rows in the sub-operation table, memory table, chiplet tables, etc.
209
+
210
+
> Implementation detail: an instruction's gas cost takes into account the costs of associated downstream computations. So, an instruction that triggers accesses to the public data tree (`SLOAD`/`SSTORE`) incurs a cost that accounts for state access validation in later circuits (public kernel or rollup). An instruction that triggers a nested message call (`CALL`/`STATICCALL`/`DELEGATECALL`) incurs a cost accounting for the nested call's execution and an added execution of the public kernel circuit.
211
+
212
+
## Halting
213
+
A message call's execution can end with a **normal halt** or **exceptional halt**. A halt ends execution within the current context and returns control flow to the calling context.
214
+
215
+
### Normal halting
216
+
A normal halt occurs when the VM encounters an explicit halting instruction ([`RETURN`](./InstructionSet/#isa-section-return) or [`REVERT`](./InstructionSet/#isa-section-revert)). Such instructions consume gas normally and optionally initialize some output data before finally halting execution within the current context.
> Definitions: `retOffset` and `retSize` here are arguments to the [`RETURN`](./InstructionSet/#isa-section-return) and [`REVERT`](./InstructionSet/#isa-section-revert) instructions. If `retSize` is 0, the context will have no output. Otherwise, these arguments point to a region of memory to output.
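To make the role of `retOffset` and `retSize` concrete, here is a small Python sketch of how a normal halt might populate the message call results. It is only a sketch: the `normal_halt` function, the modeling of memory as a Python list, and the use of `None` for "no output" are assumptions, and gas consumption is left out.

```python
def normal_halt(opcode: str, memory: list, ret_offset: int, ret_size: int) -> dict:
    """Build message call results for an explicit halting instruction."""
    if opcode not in ("RETURN", "REVERT"):
        raise ValueError("not a halting instruction: " + opcode)
    # With retSize == 0 the context has no output; otherwise the output is
    # the memory region [retOffset, retOffset + retSize).
    output = memory[ret_offset:ret_offset + ret_size] if ret_size > 0 else None
    return {"reverted": opcode == "REVERT", "output": output}
```

For example, `normal_halt("RETURN", [1, 2, 3, 4], 1, 2)` yields results that are not reverted and carry the two memory words starting at offset 1.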
224
+
225
+
### Exceptional halting
226
+
An exceptional halt is not explicitly triggered by an instruction but instead occurs when one of the following halting conditions is met:
1. **World state modification attempt during a static call**
243
+
```
244
+
assert !environment.isStaticCall
245
+
or environment.bytecode[machineState.pc].opcode not in WS_MODIFYING_OPS
246
+
```
247
+
> Definition: `WS_MODIFYING_OPS` represents the list of all opcodes corresponding to instructions that modify world state.
248
+
249
+
When an exceptional halt occurs, the context is flagged as consuming all of its allocated gas and marked as `reverted` with no output data, and then execution within the current context ends.
250
+
```
251
+
machineState.l1GasLeft = 0
252
+
machineState.l2GasLeft = 0
253
+
results.reverted = true
254
+
// results.output remains undefined
255
+
```
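The static-call condition and the effect of an exceptional halt can be sketched together in Python. This is an illustrative sketch only: `HaltState`, `static_call_allows`, `exceptional_halt`, and `step_guard` are hypothetical names, and the contents of `WS_MODIFYING_OPS` here are a placeholder (only `SSTORE` is listed), not the actual opcode list.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder contents: the real list covers every world-state-modifying opcode.
WS_MODIFYING_OPS = frozenset({"SSTORE"})


@dataclass
class HaltState:
    l1_gas_left: int
    l2_gas_left: int
    reverted: bool = False
    output: Optional[list] = None  # None stands in for "undefined"


def static_call_allows(is_static_call: bool, opcode: str) -> bool:
    """Mirror of: !isStaticCall or opcode not in WS_MODIFYING_OPS."""
    return (not is_static_call) or (opcode not in WS_MODIFYING_OPS)


def exceptional_halt(state: HaltState) -> None:
    """Consume all gas and mark the context as reverted with no output."""
    state.l1_gas_left = 0
    state.l2_gas_left = 0
    state.reverted = True
    # state.output is intentionally left untouched (no output data).


def step_guard(state: HaltState, is_static_call: bool, opcode: str) -> bool:
    """Return True if execution may proceed, else halt exceptionally."""
    if not static_call_allows(is_static_call, opcode):
        exceptional_halt(state)
        return False
    return True
```

Other halting conditions would be checked in the same way, each one routing to the same `exceptional_halt` helper.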
256
+
257
+
## Nested calls
258
+
During a message call's execution, an instruction may be encountered that triggers another message call. A message call triggered in this way may be referred to as a **nested call**. The purpose of the [`CALL`](./InstructionSet/#isa-section-call), [`STATICCALL`](./InstructionSet/#isa-section-staticcall), and `DELEGATECALL` instructions is to initiate nested calls.
259
+
260
+
261
+
### Context initialization for a nested call
262
+
Initiation of a nested call requires the creation of a new context (or **sub-context**).
263
+
```
264
+
subContext = AVMContext {
265
+
environment: nestedExecutionEnvironment, // defined below
266
+
machineState: nestedMachineState, // defined below
267
+
worldState: callingContext.worldState,
268
+
accruedSubstate: empty,
269
+
results: INITIAL_MESSAGE_CALL_RESULTS,
270
+
}
271
+
```
272
+
While some context members are initialized as empty (as they are for an initial message call), other entries are derived from the calling context or from the message call instruction's arguments (`instr.args`).
273
+
274
+
The world state is forwarded as-is to the sub-context. Any updates made to the world state before this message call instruction was encountered are carried forward into the sub-context.
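The forwarding rule above, combined with the earlier statements that a caller applies a successful call's world state updates and substate but rejects those of a failed call, can be sketched in Python. This is a sketch under assumptions: the `Context` class and both functions are hypothetical, the environment and machine state are omitted, and the sub-context is given a deep copy of the world state purely so that the caller can discard it on failure (the specification does not prescribe this mechanism).

```python
import copy
from dataclasses import dataclass, field


def _empty_substate() -> dict:
    return {"logs": [], "l2toL1Messages": []}


@dataclass
class Context:
    world_state: dict
    accrued_substate: dict = field(default_factory=_empty_substate)
    reverted: bool = False


def make_sub_context(calling: Context) -> Context:
    """Forward the caller's current world state; start with empty substate."""
    # Assumption: a copy is used so a failed call's updates can be dropped.
    return Context(world_state=copy.deepcopy(calling.world_state))


def absorb_sub_context(calling: Context, sub: Context) -> None:
    """Apply a nested call's updates to the caller only if it succeeded."""
    if sub.reverted:
        return  # world state updates and substate are dropped
    calling.world_state = sub.world_state
    for key, entries in sub.accrued_substate.items():
        calling.accrued_substate[key].extend(entries)
```

Because the sub-context starts from the caller's current world state, any updates the caller made before the nested call are visible inside it.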
275
+
276
+
The environment and machine state for the new sub-context are initialized as shown below. Here, the `callingContext` refers to the context in which the nested message call instruction was encountered.
> Note: recall that `INITIAL_MESSAGE_CALL_RESULTS` is the same initial value used during [context initialization for a public execution request's initial message call](#context-initialization-for-initial-call).
308
+
> `STATICCALL_OP` and `DELEGATECALL_OP` refer to the 8-bit opcode values for the `STATICCALL` and `DELEGATECALL` instructions respectively.
309
+
310
+
### Updating the calling context after nested call halts