-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HackerOne-2239704] Introduce an override for atomic batch operations #2296
[HackerOne-2239704] Introduce an override for atomic batch operations #2296
Conversation
I'm also going to check the failing test cases; it's likely that some of them are just using the low-level operations individually. Edit: it was actually just due to the in-memory storage making assertions that only apply to the persistent one; fixed. |
6a17280
to
64d36d7
Compare
64d36d7
to
d4eb70b
Compare
d4eb70b
to
5f50c0f
Compare
Rebased and addressed the review comment in a new commit. |
Signed-off-by: ljedrz <ljedrz@gmail.com>
Signed-off-by: ljedrz <ljedrz@gmail.com>
Signed-off-by: ljedrz <ljedrz@gmail.com>
Signed-off-by: ljedrz <ljedrz@gmail.com>
5f50c0f
to
6bfa9a3
Compare
Rebased and addressed the latest review comments in a new commit. I also noticed that the unhappy path for finalization needed to be adjusted, which was done in the last commit. |
Signed-off-by: ljedrz <ljedrz@gmail.com>
synthesizer/src/vm/mod.rs
Outdated
@@ -335,12 +335,32 @@ impl<N: Network, C: ConsensusStorage<N>> VM<N, C> { | |||
// Attention: The following order is crucial because if 'finalize' fails, we can rollback the block. | |||
// If one first calls 'finalize', then calls 'insert(block)' and it fails, there is no way to rollback 'finalize'. | |||
|
|||
// Enable the atomic batch override, so that both the insertion and finalization belong to a single batch. | |||
#[cfg(feature = "rocks")] | |||
assert!(self.block_store().flip_atomic_override()?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be debug_assert!
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the plain assert
was intentional, as this ensures that it is initially disabled - if this was not the case, it would indicate a logic bug, and other atomic operations could be affected
synthesizer/src/vm/mod.rs
Outdated
self.finalize_store().atomic_abort(); | ||
// Disable the atomic batch override. | ||
assert!(!self.block_store().flip_atomic_override()?); | ||
} | ||
// Rollback the block. | ||
self.block_store().remove_last_n(1).map_err(|removal_error| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still incorrect.
Calling remove_last_n(1)
here will 1) correctly update the Merkle tree, however 2) it will also delete the penultimate block now from self.block_store()
.
This is erroneous because we are using this concept of a flip_atomic_override
to remove the pending last block, so the call to remove_last_n(1)
will now remove the penultimate block erroneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed; there are 2 ways I can see of tackling this; we can either:
- put the storage deletion within
remove_last_n
behind a feature flag (so that it doesn't happen withrocks
); however, this would obscure the purpose of the method and affect its use outside of typical runtime (and probably break some tests) - break the
remove_last_n
method into 2 distinct operations (isolated into new methods) - the tree update and the storage update - and only call the tree update here; I'll see if I can propose something "clean" for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I proposed the 2nd approach in ProvableHQ@1a27305. The scope of the write lock over the Merkle tree is different, but since the entirety of block insertion is behind the block_lock
, this shouldn't affect anything (I'm assuming that only block insertion/rollback can alter it).
Signed-off-by: ljedrz <ljedrz@gmail.com>
Signed-off-by: ljedrz <ljedrz@gmail.com>
Done, and tested locally. |
Signed-off-by: ljedrz <ljedrz@gmail.com>
6e5e390
to
f656d9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR introduces a database-level override flag that does not allow an atomic operation to be executed until it's flipped manually.
This is a relatively simple solution to make block insertion and finalization happen as a single atomic storage operation; the reason why they can't be carried out like that "normally" is that they are disjoint in the storage hierarchy.
This should fix the storage corruption that can currently happen if the node is shut down between those two operations.
Filing it as a draft, as I'd like to further improve the related docs, but it worked when I tested it locally.
This PR should address: