Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Fuzzer for noir programs #6770

Open
wants to merge 105 commits into
base: master
Choose a base branch
from
Open

feat: Fuzzer for noir programs #6770

wants to merge 105 commits into from

Conversation

Rumata888
Copy link
Collaborator

@Rumata888 Rumata888 commented Dec 11, 2024

Description

A coverage-guided fuzzer for automatically testing Noir programs. The fuzzer generates and mutates test inputs to find bugs while maximizing code coverage.

Key Features

  • Coverage-guided fuzzing using both ACIR and Brillig execution modes
  • Parallel execution support for improved performance
  • Maintains and evolves a corpus of high-coverage test cases
  • Mutation-based fuzzing with crossover capabilities
  • Detects discrepancies between ACIR and Brillig execution modes
  • Supports corpus minimization
  • Can be used with an oracle for differential fuzzing
  • Configurable failure conditions via only_fail_with attribute

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

Comment on lines +246 to +249
let input_vector = match previous_input {
InputValue::Vec(previous_input_vector) => previous_input_vector,
_ => panic!("Mismatch of AbiType and InputValue should not happen"),
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
let input_vector = match previous_input {
InputValue::Vec(previous_input_vector) => previous_input_vector,
_ => panic!("Mismatch of AbiType and InputValue should not happen"),
};
let InputValue::Vec(input_vector) = previous_input else {
panic!("Mismatch of AbiType and InputValue should not happen")
};

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be done in the Struct and Tuple cases.

}
result
}
/// Create a spliced version of 2 buffers, where each element in the result is at the same index as in the original ones
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you include an example of this? That is, an example first_buffer and second_buffer, with one possible outcome.

I also think it would be good to include small examples like this in all mutations so it's easier to understand what they do without having to read the code.

second_input_map: &InputMap,
prng: &mut XorShiftRng,
) -> InputMap {
// TODO: add the most basic splice where the whole map is converted into a witness map and then we splice those
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe capture a follow-up issue for this (or drop the TODO if it's not needed anymore)


impl<'a> StringMutator<'a> {
pub fn new(dictionary: &'a IntDictionary, prng: &'a mut XorShiftRng) -> Self {
let u8_dictionary = dictionary.get_dictionary_by_width(8);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for the dictionary here to only consist of bytes that are ascii?

Comment on lines +23 to +27
fn structured_splice(
&mut self,
first_buffer: &[InputValue],
second_buffer: &[InputValue],
) -> Vec<InputValue> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this one and the one in string could be done by calling a generic helper function with T being InputValue or u8 (the logic seems to be the same)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for swap and duplicate_chunk

}

/// Create buffer from random chunks of 2 buffers
fn chaotic_splice(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one too could be generalized over a T, though I see the logic between the two is slightly different though I don't know if that's intentional.

}

/// Swap 2 random chunks in the buffer
fn swap(&mut self, buffer: &Vec<InputValue>) -> Vec<InputValue> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method could be tested by having a separate function that receives all the values that are generated randomly (mainly to have a small proof that it works correctly).


/// Convert a value from integer representation to field representation
/// Width is needed to detect negative values in 2-complement form
fn i128_to_field(value: i128, width: u32) -> FieldElement {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use mut value so there's no need for new_value.


/// Generate a random unsigned integer of given width and convert to field
fn generate_random_for_width(prng: &mut XorShiftRng, width: u32) -> FieldElement {
FieldElement::from(prng.gen_range(0..(2i128 << width)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this 2 intentional here or should it be 1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake

.expect("There should be a witness in the witness map");

// Check that the values are zero or 1
if *value != FieldElement::zero() && *value != FieldElement::one() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency we could use is_zero/is_one here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// Look for witnesses that are either 0 or 1
for (witness_index, value) in first_func_witnesses.witness.clone().into_iter() {
if value == FieldElement::one() || value == FieldElement::zero() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency we could use is_zero/is_one here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +562 to +564
// 1. A new branch is taken
// 2. A branch is taken more times than ever encountered before in a single execution
// 3. A branch is taken pow2 times that hasn't been previously observed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are 2 and 3 dependent on 1 here?

Comment on lines +589 to +596
let mut least_different_bits = u32::MAX;
let mut last_value = 0;
for i in 0..(cmp_approach.bits + 1) {
if !new_coverage.brillig_coverage[i + cmp_approach.raw_index].is_zero() {
least_different_bits = (cmp_approach.bits - i) as u32;
last_value = new_coverage.brillig_coverage[i + cmp_approach.raw_index];
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this use find_closest_comparison?

Comment on lines +665 to +666
feature_to_index_map.insert((opcode_index, usize::MAX - 1), total_features);
feature_to_index_map.insert((opcode_index, usize::MAX), total_features + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the tracing code first and couldn't understand why usize::MAX and usize::MAX - 1 were used there, but now seeing this code I understand. I wonder if these could be extracted into documented constants so that it's easier to see where they come from and what they mean.

acvm::acir::brillig::BinaryFieldOp::Equals
| acvm::acir::brillig::BinaryFieldOp::LessThan
| acvm::acir::brillig::BinaryFieldOp::LessThanEquals => {
let features_per_comparison= 1 /*true */+1/*false */+255 /*possible bits() results*/;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be 254?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried changing it and stuff broke, so I'll leave it as is

Comment on lines 712 to 719
let features_per_comparison = 1 /*true */+1/*false */+1/*when ilog is zero*/+ match bit_size{
acvm::acir::brillig::IntegerBitSize::U1 => 1,
acvm::acir::brillig::IntegerBitSize::U8 => 8,
acvm::acir::brillig::IntegerBitSize::U16 => 16,
acvm::acir::brillig::IntegerBitSize::U32 => 32,
acvm::acir::brillig::IntegerBitSize::U64 => 64,
acvm::acir::brillig::IntegerBitSize::U128 => 128,
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there's impl From<IntegerBitSize> for u32

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

}
coverage_items.push(BrilligCoverageItemRange::Comparison(CmpCoverageRange {
index: total_features,
bits: features_per_comparison - 3,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep the features_for_bits as a separate variable so we don't need to subtract here

total_weight: usize,
}

/// Configuration for selecting a byte value mutation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should go on top of the struct

@@ -0,0 +1,874 @@
use rand::Rng;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for selecting a mutation based on weights inside a configuration is repeated here a lot. I was thinking how this could be avoided and came up with this: https://gist.github.com/asterite/cbea91f23ec600be288bcc39b0a3d14e

The only downside to it that I see is that you might forget to include an option in the config... however, I think that downside also exists in the current code: if you add a new option you must remember to add it to the configuration, and you must also remember to add to the total weight.

Comment on lines +125 to +127
let parsed_source = parse_json(&source, &self.abi).map_err(|parsing_error| {
format!("Error while parsing file {:?}: {:?}", path.as_os_str(), parsing_error)
})?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run nargo fuzz against one function and then change that function's signature and run nargo fuzz again it's almost sure you'll get an error here:

[one] Executed fuzzing task on test...issue with corpus:
Error while parsing file "corpus/one/test/060a619ba2124046f9b324b52ba4526b7896ed40ed7b2bc52a962eac1a178637.json": MissingArgument("y")
[one] 1 test failed

I guess that's fine, but I wonder if we can guide the user towards fixing that error. For example I deleted that file and it failed again because there was another file. After deleting both files it succeeded.

Not something to improve here, but maybe capture a follow-up issue for this.

}

/// Start the fuzzing campaign
pub fn fuzz(&mut self) -> FuzzTestResult {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if it were possible to break this method down into using some helper functions, mainly because it's very long and also to turn individual processing chunks into separate functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already broken up, but I'll try

Copy link
Collaborator

@asterite asterite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incredible work!

The code looks great. I must admit that I still don't fully understand 100% of it but mainly because it's a lot to absorb (just yesterday I learned about greybox fuzzing). That said, I think I understand the gist of it, also thanks to our pairing today. Also, it seems to work very well, for the example programs and for some things I tried.

I did left many suggestions, mainly about organizing code, removing duplication and making code clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants