Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(docs): More acir docs #7731

Merged
merged 9 commits into from
Mar 18, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
215 changes: 19 additions & 196 deletions acvm-repo/acir/README.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,4 @@
# ACIR documentation (draft)

## Abstract

This document describes the purpose of ACIR, what it is and how ACIR programs
can be used by compilers and proving systems. It is intended to be a reference
documentation for ACIR.

## Introduction
# The Abstract Circuit Intermediate Representation (ACIR)

The purpose of ACIR is to make the link between a generic proving system, such
as Aztec's Barretenberg, and a frontend, such as Noir, which describes user
Expand All @@ -25,8 +17,8 @@ proving system, that can be used both by proving system as well as a target for
frontends. So, at the end of the day, an ACIR program is just another
representation of a program, dedicated to proving systems.

## Abstract Circuit Intermediate Representation
ACIR stands for abstract circuit intermediate representation:
## ACIR
ACIR stands for Abstract Circuit Intermediate Representation:
- **abstract circuit**: circuits are a simple computation model where basic
computation units, named gates, are connected with wires. Data flows
through the wires while gates compute output wires based on their input.
Expand All @@ -37,13 +29,13 @@ ACIR stands for abstract circuit intermediate representation:
lose any expressiveness when using a circuit as it is well known that any
bounded computation can be translated into an arithmetic circuit (i.e a
circuit with only addition and multiplication gates).
The term abstract here simply means that we do not refer to an actual physical
circuit (such as an electronic circuit). Furthermore, we will not exactly use
the circuit model, but another model even better suited to ZKPs, the constraint
model (see below).
The term abstract here simply means that we do not refer to an actual physical
circuit (such as an electronic circuit). Furthermore, we will not exactly use
the circuit model, but another model even better suited to ZKPs, the constraint
model (see below).
- **intermediate representation**: The ACIR representation is intermediate
because it lies between a frontend and its proving system. ACIR bytecode makes
the link between noir compiler output and the proving system backend input.
because it lies between a frontend and its proving system. ACIR bytecode makes
the link between noir compiler output and the proving system backend input.

## The constraint model

Expand All @@ -60,7 +52,6 @@ addition gate `output_wire = input_wire_1 + input_wire_2` can be expressed with
the following arithmetic constraint:
`output_wire - (input_wire_1 + input_wire_2) = 0`


## Solving

Because of these constraints, executing an ACIR program is called solving the
Expand Down Expand Up @@ -90,194 +81,26 @@ will refer to it as `FieldElement` or ACIR field. The proving system needs the
values of all the partial witnesses and all the constraints in order to generate
a proof.

*Remark*: The value of a partial witness is unique and fixed throughout a program
_Note_: The value of a partial witness is unique and fixed throughout a program
execution, although in some rare cases, multiple values are possible for a
same execution and witness (when there are several valid solutions to the
constraints). Having multiple possible values for a witness may indicate that
the circuit is not safe.

*Remark*: Why do we use the term partial witnesses? It is because the proving
_Note_: Why do we use the term partial witnesses? It is because the proving
system may create other constraints and witnesses (especially with
`BlackBoxFuncCall`, see below). A proof refers to a full witness assignments
and their constraints. ACIR opcodes and their partial witnesses are still an
`BlackBoxFuncCall`, see below). A proof refers to a full witness assignment
and its constraints. ACIR opcodes and their partial witnesses are still an
intermediate representation before getting the full list of constraints and
witnesses. For the sake of simplicity, we will refer to witness instead of
partial witness from now on.


## ACIR Reference

We assume here that the proving system is Barretenberg. Some parameters may
slightly change with another proving system, in particular the bit size of
`FieldElement`, which is 254 for Barretenberg.

Some opcodes have inputs and outputs, which means that the output is constrained
to be the result of the opcode computation from the inputs. The solver expects
that all inputs are known when solving such opcodes.

Some opcodes are not constrained, which means they will not be used by the
proving system and are only used by the solver.

Finally, some opcodes will have a predicate, whose value is `0` or `1`. Its
purpose is to nullify the opcode when the value is `0`, so that it has no
effect. Note that removing the opcode is not a solution because this modifies
the circuit (the circuit being mainly the list of the opcodes).

*Remark*: Opcodes operate on witnesses, but we will see that some opcode work on
_Note_: Opcodes operate on witnesses, but we will see that some opcodes work on
expressions of witnesses. We call an expression a linear combination of
witnesses and/or products of two witnesses (and also a constant term). A
witnesses and/or products of two witnesses (also with a constant term). A
single witness is a (simple) expression, and conversely, an expression can
be turned into a single witness using an assert-zero opcode (see below). So
basically, using witnesses or expressions is equivalent, but the latter can
avoid the creation of witness in some cases.

### AssertZero opcode

An `AssertZero` opcode adds the constraint that `P(w) = 0`, where
`w=(w_1,..w_n)` is a tuple of `n` witnesses, and `P` is a multi-variate
polynomial of total degree at most `2`.

### BlackBoxFuncCall opcode

These opcodes represent a specific computation. Even if any computation can be
done using only assert-zero opcodes, it is not always efficient. Some proving
systems, and in particular the proving system from Aztec, can implement several
computations more efficiently using for instance look-up tables. The
`BlackBoxFuncCall` opcode is used to ask the proving system to handle the
computation by itself.

All black box functions take as input a tuple `(witness, num_bits)`, where
`num_bits` is a constant representing the bit size of the input witness, and they
have one or several witnesses as output.

Some more advanced computations assume that the proving system has an
"embedded curve". It is a curve that cycles with the main curve of the proving
system, i.e the scalar field of the embedded curve is the base field of the main
one, and vice-versa. The curves used by the proving system are dependent on the
proving system (and/or its configuration). Aztec's Barretenberg uses BN254 as the
main curve and Grumpkin as the embedded curve.

NOTE: see the [black_box_functions](src/circuit/black_box_functions.rs) file for
the most up-to-date documentation on these opcodes.

The black box functions supported by ACIR are:

**AES128Encrypt**: ciphers the provided plaintext using AES128 in CBC mode, padding the input using PKCS#7.

**AND**: performs the bitwise AND of `lhs` and `rhs`. `bit_size` must be the same for both inputs.

**XOR**: performs the bitwise XOR of `lhs` and `rhs`. `bit_size` must be the same for both inputs.

**RANGE**: constraint the input to be of the provided bit size

**SHA256**: computes sha256 of the inputs

**Blake2s**: computes the Blake2s hash of the inputs, as specified in https://tools.ietf.org/html/rfc7693

**Blake3**: computes the Blake3 hash of the inputs

**SchnorrVerify**: Verify a Schnorr signature over the embedded curve

**PedersenCommitment**: Computes a Pedersen commitments of the inputs using generators of the embedded curve

**PedersenHash**: Computes a Pedersen commitments of the inputs and their number, using generators of the embedded curve

**EcdsaSecp256k1**: Verify an ECDSA signature over Secp256k1

**EcdsaSecp256r1**: Same as EcdsaSecp256k1, but done over another curve.

**MultiScalarMul**: scalar multiplication with a variable base/input point (`P`) of the embedded curve

**Keccak256**: Computes the Keccak-256 (Ethereum version) of the inputs.

**Keccakf1600**: Keccak Permutation function of width 1600

**EmbeddedCurveAdd**: Embedded curve addition

**BigIntAdd**: BigInt addition

**BigIntSub**: BigInt subtraction

**BigIntMul**: BigInt multiplication

**BigIntDiv**: BigInt division

**BigIntFromLeBytes**: BigInt from le bytes

**BigIntToLeBytes**: BigInt to le bytes

**Poseidon2Permutation**: Permutation function of Poseidon2

**Sha256Compression**: SHA256 compression function

**RecursiveAggregation**: verify a proof inside the circuit.

Computes a recursive aggregation object internally when verifying a proof inside
another circuit.
The outputted aggregation object will then be either checked in a
top-level verifier or aggregated upon again.
The aggregation object should be maintained by the backend implementer.

This opcode prepares the verification of the final proof.
In order to fully verify a recursive proof, some operations may still be required
to be done by the final verifier (e.g. a pairing check).
This is why this black box function does not say if verification is passing or not.
It delays the expensive part of verification out of the SNARK
and leaves it to the final verifier outside of the SNARK circuit.

This opcode also verifies that the key_hash is indeed a hash of verification_key,
allowing the user to use the verification key as private inputs and only
have the key_hash as public input, which is more performant.

**Warning: the key hash logic does not need to be part of the black box and subject to be removed.**

If one of the recursive proofs you verify with the black box function fails to
verify, then the verification of the final proof of the main ACIR program will
ultimately fail.

### Brillig

This opcode is used as a hint for the solver when executing (solving) the
circuit. The opcode does not generate any constraint and is usually the result
of the compilation of an unconstrained noir function.

Let's see an example with euclidean division.
The normal way to compute `a/b`, where `a` and `b` are 8-bits integers, is to
implement the Euclidean algorithm which computes in a loop (or recursively)
modulus of the kind 'a mod b'. Doing this computation requires a lot of steps to
be properly implemented in ACIR, especially the loop with a condition. However,
euclidean division can be easily constrained with one assert-zero opcode:
`a = bq+r`, assuming `q` is 8 bits and `r<b`. Since these assumptions can easily
written with a few range opcodes, euclidean division can in fact be implemented
with a small number of opcodes.

However, in order to write these opcodes we need the result of the division
which are the witness `q` and `r`. But from the constraint `a=bq+r`, how can the
solver figure out how to solve `q` and `r` with only one equation? This is where
brillig/unconstrained function come into action. We simply define a function that
performs the usual Euclid algorithm to compute `q` and `r` from `a` and `b`.
Since Brillig opcode does not generate constraint, it won't be provided to the
proving system but simply used by the solver to compute the values of `q` and
`r`.

In summary, solving a Brillig opcode performs the computation defined by its
bytecode, on the provided inputs, and assign the result to the outputs witnesses,
without adding any constraint.

NOTE: see the [circuit/opcodes.rs](src/circuit/opcodes.rs) file for the most
up-to-date documentation on these opcodes.

#### MemoryOp: memory abstraction for ACIR

ACIR is able to address any array of witnesses. Each array is assigned an ID
(`BlockId`) and needs to be initialized with the `MemoryInit` opcode. Then it is
possible to read and write from/to an array by providing the index and the value
we read/write, as arithmetic expression. Note that ACIR arrays all have a known
fixed length (given in the `MemoryInit` opcode below).

#### MemoryInit

Initialize an ACIR array from a vector of witnesses.
be turned into a single witness using an assert-zero opcode.
So basically, using witnesses or expressions is equivalent,
but the latter can avoid the creation of witness in some cases.

There must be only one MemoryInit per block_id, and MemoryOp opcodes must
come after the MemoryInit.
## Brillig
29 changes: 29 additions & 0 deletions acvm-repo/acir/src/circuit/brillig.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,31 @@
//! This module contains [Brillig][brillig] structures for integration within an ACIR circuit.
//!
//! [Brillig][brillig] is used in ACIR as a hint for the solver when executing the circuit.
//! Executing Brillig does not generate any constraints and is the result of the compilation of an unconstrained function.
//!
//! Let's see an example with euclidean division.
//! The normal way to compute `a/b`, where `a` and `b` are 8-bits integers, is to
//! implement the Euclidean algorithm which computes in a loop (or recursively)
//! modulus of the kind 'a mod b'. Doing this computation requires a lot of steps to
//! be properly implemented in ACIR, especially the loop with a condition. However,
//! euclidean division can be easily constrained with one assert-zero opcode:
//! `a = bq+r`, assuming `q` is 8 bits and `r<b`. Since these assumptions can easily
//! written with a few range opcodes, euclidean division can in fact be implemented
//! with a small number of opcodes.
//!
//! However, in order to write these opcodes we need the result of the division
//! which are the witness `q` and `r`. But from the constraint `a=bq+r`, how can the
//! solver figure out how to solve `q` and `r` with only one equation? This is where
//! brillig/unconstrained function come into action. We simply define a function that
//! performs the usual Euclidean algorithm to compute `q` and `r` from `a` and `b`.
//! Since executing Brillig does not generate constraints, it is not meaningful to the
//! proving system but simply used by the solver to compute the values of `q` and
//! `r`. The output witnesses `q` and `r` are then expected to be used by the proving system.
//!
//! In summary, executing Brillig will perform the computation defined by its
//! bytecode, on the provided inputs, and assign the result to the output witnesses,
//! without adding any constraints.

use super::opcodes::BlockId;
use crate::native_types::{Expression, Witness};
use brillig::Opcode as BrilligOpcode;
Expand Down Expand Up @@ -32,6 +60,7 @@ pub struct BrilligBytecode<F> {
}

/// Id for the function being called.
/// Indexes into the table of Brillig function's specified in a [program][super::Program]
#[derive(
Clone, Debug, PartialEq, Eq, Serialize, Deserialize, Hash, Copy, Default, PartialOrd, Ord,
)]
Expand Down
29 changes: 26 additions & 3 deletions acvm-repo/acir/src/circuit/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
//! Native structures for representing ACIR

pub mod black_box_functions;
pub mod brillig;
pub mod opcodes;
Expand Down Expand Up @@ -39,7 +41,7 @@ pub enum ExpressionWidth {
},
}

/// A program represented by multiple ACIR circuits. The execution trace of these
/// A program represented by multiple ACIR [circuit][Circuit]'s. The execution trace of these
/// circuits is dictated by construction of the [crate::native_types::WitnessStack].
#[derive(Clone, PartialEq, Eq, Serialize, Deserialize, Default, Hash)]
#[cfg_attr(feature = "arb", derive(proptest_derive::Arbitrary))]
Expand All @@ -48,13 +50,20 @@ pub struct Program<F: AcirField> {
pub unconstrained_functions: Vec<BrilligBytecode<F>>,
}

/// Representation of a single ACIR circuit. The execution trace of this structure
/// is dictated by the construction of a [crate::native_types::WitnessMap]
#[derive(Clone, PartialEq, Eq, Serialize, Deserialize, Default, Hash)]
#[cfg_attr(feature = "arb", derive(proptest_derive::Arbitrary))]
pub struct Circuit<F: AcirField> {
// current_witness_index is the highest witness index in the circuit. The next witness to be added to this circuit
// will take on this value. (The value is cached here as an optimization.)
/// current_witness_index is the highest witness index in the circuit. The next witness to be added to this circuit
/// will take on this value. (The value is cached here as an optimization.)
pub current_witness_index: u32,
/// The circuit opcodes representing the relationship between witness values.
///
/// The opcodes should be further converted into a backend-specific circuit representation.
/// When initial witness inputs are provided, these opcodes can also be used for generating an execution trace.
pub opcodes: Vec<Opcode<F>>,
/// Maximum width of the [expression][Expression]'s which will be constrained
pub expression_width: ExpressionWidth,

/// The set of private inputs to the circuit.
Expand All @@ -76,20 +85,32 @@ pub struct Circuit<F: AcirField> {
pub assert_messages: Vec<(OpcodeLocation, AssertionPayload<F>)>,
}

/// Enumeration of either an [expression][Expression] or a [memory identifier][BlockId].
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, Hash)]
#[cfg_attr(feature = "arb", derive(proptest_derive::Arbitrary))]
pub enum ExpressionOrMemory<F> {
Expression(Expression<F>),
Memory(BlockId),
}

/// Payload tied to an assertion failure.
/// This data allows users to specify feedback upon a constraint not being satisfied in the circuit.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, Hash)]
#[cfg_attr(feature = "arb", derive(proptest_derive::Arbitrary))]
pub struct AssertionPayload<F> {
/// Selector that maps a hash of either a constant string or an internal compiler error type
/// to an ABI type. The ABI type should then be used to appropriately resolve the payload data.
pub error_selector: u64,
/// The dynamic payload data.
///
/// Upon fetching the appropriate ABI type from the `error_selector`, the values
/// in this payload can be decoded into the given ABI type.
/// The payload is expected to be empty in the case of a constant string
/// as the string can be contained entirely within the error type and ABI type.
pub payload: Vec<ExpressionOrMemory<F>>,
}

/// Value for differentiating error types. Used internally by an [AssertionPayload].
#[derive(Debug, Copy, PartialEq, Eq, Hash, Clone, PartialOrd, Ord)]
pub struct ErrorSelector(u64);

Expand Down Expand Up @@ -134,6 +155,8 @@ pub enum OpcodeLocation {
Brillig { acir_index: usize, brillig_index: usize },
}

/// Index of Brillig opcode within a list of Brillig opcodes.
/// To be used by callers for resolving debug information.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord, Serialize, Deserialize)]
pub struct BrilligOpcodeLocation(pub usize);

Expand Down
Loading
Loading