Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add first persistence in client #268

Closed
wants to merge 1 commit into from

Conversation

oxade
Copy link
Contributor

@oxade oxade commented Jan 25, 2022

First piece, part of #257

The goal here is to move towards persisting client state and objects, and hopefully be able to recover a client after a crash without too much syncing.

This PR starts by adding some basic storage logic and a simple use-case which checks for pending transfer.

Overhauling the client to support persistence and recovery was reaching thousands of lines, so I chunked it out into small parts to avoid conflicting Patrick's work.

Next steps:

  1. Client should store its own objects (and not just the refs). This allows more robust digest checks too and can avoud hitting authorities for downloads
  2. Client should implement recovery on crash by loading its data structures from disk
  3. Client should use deterministic path for DB
  4. Client should archive sent certificates for later auditing/sanity checks

@@ -151,17 +158,18 @@ impl<A> ClientState<A> {
.iter()
.map(|(id, (_, seq, _))| (*id, *seq))
.collect();
let path = env::temp_dir().join(format!("DB_{:?}", ObjectID::random()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random? shouldn't this read and write to the same path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is mentioned as part of the TODOs once we add state recovery

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tempfile is a good Rust crate to save you from creating a filename from an object ID.

@@ -151,17 +158,18 @@ impl<A> ClientState<A> {
.iter()
.map(|(id, (_, seq, _))| (*id, *seq))
.collect();
let path = env::temp_dir().join(format!("DB_{:?}", ObjectID::random()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tempfile is a good Rust crate to save you from creating a filename from an object ID.

use typed_store::rocks::{open_cf, DBMap};
use typed_store::traits::Map;

const PENDING_TRANSFER_KEY: &str = "PENDING_TRANSFER_KEY";
Copy link
Collaborator

@sblackshear sblackshear Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed renaming "transfer" to "transaction" is part of the planned terminological refactor (since orders can do much more than transfer!)? Even if we are holding off on that refactor for now, let's avoid introducing new instances of this confusing term that we'll need to change later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patrickkuo We'll need to extend the use of this flag to other operations once I change it to PENDING_TRANSACTION

impl ClientStore {
/// Open an client store by directory path
pub fn open<P: AsRef<Path>>(path: P, db_options: Option<Options>) -> ClientStore {
let db = open_cf(&path, db_options, &["pending_transfer"]).expect("Cannot open DB.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the string pending_transfer do here + below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the name of the column family. This was adapter form authority_store.
However I think we can remove it for this trivial case as Rocks DB defaults to 'default' column family

.expect("Cannot open pending_transfer CF."),
}
}
// Get the pending tx if any
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: doc comments on all public functions

.get(&PENDING_TRANSFER_KEY.to_string())
.map_err(|e| e.into())
}
// Set the pending tx if any
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Set the pending tx if any
// Set the pending tx

Should also specify what happens if you try to set and there's already a tx there.

.insert(&PENDING_TRANSFER_KEY.to_string(), order)
.map_err(|e| e.into())
}
pub fn clear_pending_transfer(&self) -> Result<(), FastPayError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment should specify what happens if you try to clear and there's nothing there.

@@ -724,14 +728,15 @@ where
with_confirmation: bool,
) -> Result<CertifiedOrder, anyhow::Error> {
ensure!(
self.pending_transfer == None || self.pending_transfer.as_ref() == Some(&order),
self.store.get_pending_transfer()?.is_none()
Copy link
Contributor

@patrickkuo patrickkuo Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't want to persist pending_tx, this field should be transient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the client dies before this is complete or before it clears the pending state?
IIRC, operations are idempotent so it's okay to re-apply the same thing on wake-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will actually cause a problem, pending transfer is unset when a transaction completed, there will be no way to unset it after restart, and will prevent client from starting any new transaction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the pending transfer returns the order it is pending on
So the client can simply retry the order and carry on, which is a one-time thing
How does this block future orders?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have logic to re-submit pending transfer on restart then it will carry on, or else it will fail on this line, but the question is do we really want to do that automatically or let client decide how to recover from shutdown.

I think the main function of pending_transfer is to stop more than one transaction running at the same time to prevent race condition, that's why I think it's transient.

@oxade oxade closed this Jan 27, 2022
@oxade oxade deleted the feature/add-db-persistence-to-client branch January 28, 2022 01:20
mwtian pushed a commit that referenced this pull request Sep 12, 2022
- responds to an invalid serialized batch,
- closes #268
mwtian pushed a commit to mwtian/sui that referenced this pull request Sep 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants