-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Fuzzer for noir programs #6770
base: master
Are you sure you want to change the base?
Conversation
let input_vector = match previous_input { | ||
InputValue::Vec(previous_input_vector) => previous_input_vector, | ||
_ => panic!("Mismatch of AbiType and InputValue should not happen"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
let input_vector = match previous_input { | |
InputValue::Vec(previous_input_vector) => previous_input_vector, | |
_ => panic!("Mismatch of AbiType and InputValue should not happen"), | |
}; | |
let InputValue::Vec(input_vector) = previous_input else { | |
panic!("Mismatch of AbiType and InputValue should not happen") | |
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be done in the Struct
and Tuple
cases.
} | ||
result | ||
} | ||
/// Create a spliced version of 2 buffers, where each element in the result is at the same index as in the original ones |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you include an example of this? That is, an example first_buffer and second_buffer, with one possible outcome.
I also think it would be good to include small examples like this in all mutations so it's easier to understand what they do without having to read the code.
second_input_map: &InputMap, | ||
prng: &mut XorShiftRng, | ||
) -> InputMap { | ||
// TODO: add the most basic splice where the whole map is converted into a witness map and then we splice those |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe capture a follow-up issue for this (or drop the TODO if it's not needed anymore)
|
||
impl<'a> StringMutator<'a> { | ||
pub fn new(dictionary: &'a IntDictionary, prng: &'a mut XorShiftRng) -> Self { | ||
let u8_dictionary = dictionary.get_dictionary_by_width(8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense for the dictionary here to only consist of bytes that are ascii?
fn structured_splice( | ||
&mut self, | ||
first_buffer: &[InputValue], | ||
second_buffer: &[InputValue], | ||
) -> Vec<InputValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this one and the one in string could be done by calling a generic helper function with T
being InputValue
or u8
(the logic seems to be the same)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same goes for swap
and duplicate_chunk
} | ||
|
||
/// Create buffer from random chunks of 2 buffers | ||
fn chaotic_splice( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one too could be generalized over a T
, though I see the logic between the two is slightly different though I don't know if that's intentional.
} | ||
|
||
/// Swap 2 random chunks in the buffer | ||
fn swap(&mut self, buffer: &Vec<InputValue>) -> Vec<InputValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method could be tested by having a separate function that receives all the values that are generated randomly (mainly to have a small proof that it works correctly).
|
||
/// Convert a value from integer representation to field representation | ||
/// Width is needed to detect negative values in 2-complement form | ||
fn i128_to_field(value: i128, width: u32) -> FieldElement { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use mut value
so there's no need for new_value
.
|
||
/// Generate a random unsigned integer of given width and convert to field | ||
fn generate_random_for_width(prng: &mut XorShiftRng, width: u32) -> FieldElement { | ||
FieldElement::from(prng.gen_range(0..(2i128 << width))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this 2 intentional here or should it be 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake
.expect("There should be a witness in the witness map"); | ||
|
||
// Check that the values are zero or 1 | ||
if *value != FieldElement::zero() && *value != FieldElement::one() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency we could use is_zero
/is_one
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
// Look for witnesses that are either 0 or 1 | ||
for (witness_index, value) in first_func_witnesses.witness.clone().into_iter() { | ||
if value == FieldElement::one() || value == FieldElement::zero() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency we could use is_zero
/is_one
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
// 1. A new branch is taken | ||
// 2. A branch is taken more times than ever encountered before in a single execution | ||
// 3. A branch is taken pow2 times that hasn't been previously observed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are 2 and 3 dependent on 1 here?
let mut least_different_bits = u32::MAX; | ||
let mut last_value = 0; | ||
for i in 0..(cmp_approach.bits + 1) { | ||
if !new_coverage.brillig_coverage[i + cmp_approach.raw_index].is_zero() { | ||
least_different_bits = (cmp_approach.bits - i) as u32; | ||
last_value = new_coverage.brillig_coverage[i + cmp_approach.raw_index]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this use find_closest_comparison
?
feature_to_index_map.insert((opcode_index, usize::MAX - 1), total_features); | ||
feature_to_index_map.insert((opcode_index, usize::MAX), total_features + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the tracing code first and couldn't understand why usize::MAX
and usize::MAX - 1
were used there, but now seeing this code I understand. I wonder if these could be extracted into documented constants so that it's easier to see where they come from and what they mean.
acvm::acir::brillig::BinaryFieldOp::Equals | ||
| acvm::acir::brillig::BinaryFieldOp::LessThan | ||
| acvm::acir::brillig::BinaryFieldOp::LessThanEquals => { | ||
let features_per_comparison= 1 /*true */+1/*false */+255 /*possible bits() results*/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be 254?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried changing it and stuff broke, so I'll leave it as is
let features_per_comparison = 1 /*true */+1/*false */+1/*when ilog is zero*/+ match bit_size{ | ||
acvm::acir::brillig::IntegerBitSize::U1 => 1, | ||
acvm::acir::brillig::IntegerBitSize::U8 => 8, | ||
acvm::acir::brillig::IntegerBitSize::U16 => 16, | ||
acvm::acir::brillig::IntegerBitSize::U32 => 32, | ||
acvm::acir::brillig::IntegerBitSize::U64 => 64, | ||
acvm::acir::brillig::IntegerBitSize::U128 => 128, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems there's impl From<IntegerBitSize> for u32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
} | ||
coverage_items.push(BrilligCoverageItemRange::Comparison(CmpCoverageRange { | ||
index: total_features, | ||
bits: features_per_comparison - 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe keep the features_for_bits as a separate variable so we don't need to subtract here
total_weight: usize, | ||
} | ||
|
||
/// Configuration for selecting a byte value mutation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should go on top of the struct
@@ -0,0 +1,874 @@ | |||
use rand::Rng; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for selecting a mutation based on weights inside a configuration is repeated here a lot. I was thinking how this could be avoided and came up with this: https://gist.github.com/asterite/cbea91f23ec600be288bcc39b0a3d14e
The only downside to it that I see is that you might forget to include an option in the config... however, I think that downside also exists in the current code: if you add a new option you must remember to add it to the configuration, and you must also remember to add to the total weight.
let parsed_source = parse_json(&source, &self.abi).map_err(|parsing_error| { | ||
format!("Error while parsing file {:?}: {:?}", path.as_os_str(), parsing_error) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you run nargo fuzz
against one function and then change that function's signature and run nargo fuzz
again it's almost sure you'll get an error here:
[one] Executed fuzzing task on test...issue with corpus:
Error while parsing file "corpus/one/test/060a619ba2124046f9b324b52ba4526b7896ed40ed7b2bc52a962eac1a178637.json": MissingArgument("y")
[one] 1 test failed
I guess that's fine, but I wonder if we can guide the user towards fixing that error. For example I deleted that file and it failed again because there was another file. After deleting both files it succeeded.
Not something to improve here, but maybe capture a follow-up issue for this.
} | ||
|
||
/// Start the fuzzing campaign | ||
pub fn fuzz(&mut self) -> FuzzTestResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if it were possible to break this method down into using some helper functions, mainly because it's very long and also to turn individual processing chunks into separate functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already broken up, but I'll try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incredible work!
The code looks great. I must admit that I still don't fully understand 100% of it but mainly because it's a lot to absorb (just yesterday I learned about greybox fuzzing). That said, I think I understand the gist of it, also thanks to our pairing today. Also, it seems to work very well, for the example programs and for some things I tried.
I did left many suggestions, mainly about organizing code, removing duplication and making code clearer.
Description
A coverage-guided fuzzer for automatically testing Noir programs. The fuzzer generates and mutates test inputs to find bugs while maximizing code coverage.
Key Features
only_fail_with
attributeDocumentation*
Check one:
PR Checklist*
cargo fmt
on default settings.