Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Client State Persistence & Reloads (DO NOT MERGE) #270

Closed
wants to merge 1 commit into from

Conversation

oxade
Copy link
Contributor

@oxade oxade commented Jan 26, 2022

Persisting client state

TLDR:

1. How do we take advantage of storage to aid proper client reloads?

2. Do we persist all old objects?

3. We should persist all certificates, even for objects we no longer own, right?

Current client state variables

The following data structures from https://github.com/MystenLabs/fastnft/blob/main/fastpay_core/src/client.rs#L59 are used by the client but to support storage, they'll be modified and represented in some store: Arc<ClientStore>, similar on AuthorityStore https://github.com/MystenLabs/fastnft/blob/main/fastpay_core/src/authority/authority_store.rs

pub struct ClientState<AuthorityClient> {
    /// Our FastPay address.
    address: FastPayAddress,
    /// Our signature key.
    secret: KeyPair,
    /// Our FastPay committee.
    committee: Committee,
    /// How to talk to this committee.
    authority_clients: HashMap<AuthorityName, AuthorityClient>,
    /// Pending transfer.
    pending_transfer: Option<Order>,

    // The remaining fields are used to minimize networking, and may not always be persisted locally.
    /// Known certificates, indexed by TX digest.
    certificates: BTreeMap<TransactionDigest, CertifiedOrder>,
    /// The known objects with it's sequence number owned by the client.
    object_sequence_numbers: BTreeMap<ObjectID, SequenceNumber>,
    /// Confirmed objects with it's ref owned by the client.
    object_refs: BTreeMap<ObjectID, ObjectRef>,
    /// Certificate <-> object id linking map.
    object_certs: BTreeMap<ObjectID, Vec<TransactionDigest>>,
}

However with some changes

The following will be kept and persisted normally (as DBMap):

    address: FastPayAddress,
    secret: KeyPair,
    committee: Committee,
    authority_clients: HashMap<AuthorityName, AuthorityClient>,
    certificates: BTreeMap<TransactionDigest, CertifiedOrder>,
    object_refs: BTreeMap<ObjectID, ObjectRef>,
    object_certs: BTreeMap<ObjectID, Vec<TransactionDigest>>,

The following will be deleted:

object_sequence_numbers: BTreeMap<ObjectID, SequenceNumber> as it can be derived from object_refs: BTreeMap<ObjectID, ObjectRef>

The following will be added:

past_objects_map: DBMap<ObjectID, BTreeMap<SequenceNumber, Object>> which stores all versions of the old objects (similar object entries will be compacted)

latest_object_map: DBMap<ObjectID, Object> which stores the latest object (for lightweight retrieval)

The following will be changed

pending_transfer: Option<Order> should be changed to pending_orders: HashMap<ObjectID, Order> and should be persisted to prevent equivocation on crash & restart

What's the proper way of saving and reloading pending orders?

My thought process:

Upon every tx order, the client saves the ObjectId-Order pair to the pending_transfer map and clears it when finalized. While a flag is set for an object, the client cannot mutate the object. If the client crashes before clearing the flag, it can replay the order after restarting. This is safe due to idempotence.

What's wrong with this approach?

@oxade oxade changed the title Create dumm-file.md DO NOT MERGE: RFC Only Jan 26, 2022
@oxade
Copy link
Contributor Author

oxade commented Jan 26, 2022

@gdanezis I would love your input on this

@oxade oxade changed the title DO NOT MERGE: RFC Only RFC: Client State Persistence & Reloads (DO NOT MERGE) Jan 26, 2022
@gdanezis
Copy link
Collaborator

Hey, here are some thoughts:

object_certs: BTreeMap<ObjectID, Vec>

This object_certs is fishy. What is this list of TransactionDigests? Are you trying to re-create the concept of parent as in the authority structure parent_sync: DBMap<ObjectRef, TransactionDigest>?

object_refs: BTreeMap<ObjectID, ObjectRef>,
latest_object_map: DBMap<ObjectID, Object>

I think if you have latest_object_map then object_refs is redundant, since you can get the full ref from the object.

past_objects_map: DBMap<ObjectID, BTreeMap<SequenceNumber, Object>>

Why not simply keeping a structure objects : DBMap<ObjectRef, Object> which provides both the latest object and the previous versions if you still need them? Note that using the .skip_to()` on the iterator you can get to the last object in that structure.

pending_orders: HashMap<ObjectID, Order>

I think the approach you outline to avoid a client double spending an object by mistake is broadly sound. I would have a structure more like:

pending_orders: HashMap<OrderDigest, Order>
locked_objects: HashMap<ObjectRef, OrderDigest>

Then upon confirmation of an order (when the cert has been given to 2f+1 authorities) I would remove the Order from peding_orders and the corresponding objects from locked_objects.

@oxade
Copy link
Contributor Author

oxade commented Jan 26, 2022

Thanks @gdanezis , here's my comments in-line.

Hey, here are some thoughts:

object_certs: BTreeMap<ObjectID, Vec>

This object_certs is fishy. What is this list of TransactionDigests? Are you trying to re-create the concept of parent as in the authority structure parent_sync: DBMap<ObjectRef, TransactionDigest>?

I'm not sure of the full intent myself. @patrickkuo why was this added?

object_refs: BTreeMap<ObjectID, ObjectRef>,
latest_object_map: DBMap<ObjectID, Object>

I think if you have latest_object_map then object_refs is redundant, since you can get the full ref from the object.

Agreed but the problem is there can be a lag between when we receive a ref and when we fetch the actual object, since objects can be arbitrarily sized. This object_refs: BTreeMap<ObjectID, ObjectRef> allows us to not block on the object download before performing other operations.

past_objects_map: DBMap<ObjectID, BTreeMap<SequenceNumber, Object>>

Why not simply keeping a structure objects : DBMap<ObjectRef, Object> which provides both the latest object and the previous versions if you still need them? Note that using the .skip_to()` on the iterator you can get to the last object in that structure.

If this is indeed low cost in RocksDB, then it makes sense to use.
Also my initial intent was to be able to reduce space due to saving multiple copies of objects that dont change across sequence numbers and easily run binary search on BTreeMap<SequenceNumber, Object> to find the lower bound corresponding to a given seq#.

pending_orders: HashMap<ObjectID, Order>

I think the approach you outline to avoid a client double spending an object by mistake is broadly sound. I would have a structure more like:

pending_orders: HashMap<OrderDigest, Order>
locked_objects: HashMap<ObjectRef, OrderDigest>

Then upon confirmation of an order (when the cert has been given to 2f+1 authorities) I would remove the Order from peding_orders and the corresponding objects from locked_objects.

Why do I need two data structures here?
Why do we need to key by object ref? Shouldn't ObjectID suffice since we only work with the latest objects anyways?

@patrickkuo
Copy link
Contributor

I'm not sure of the full intent myself. @patrickkuo why was this added?

object_certs allows us to retrieve certificates chains of an object id. this mapping is needed because each certificate can have multiple output objects which shares the same cert, and we don't store the output of a tx anywhere else.

@lxfind
Copy link
Contributor

lxfind commented Jan 26, 2022

Related to this: Could we add detailed doc comment to each field of AuthorityStore?

```rust
pub struct ClientState<AuthorityClient> {
/// Our FastPay address.
address: FastPayAddress,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a client needs to manage multiple addresses. One way to represent this might be something like:

struct AddressState {
    pending_transactions: BTreeMap<ObjectID, Order>,
    certificates: BTreeMap<TransactionDigest, CertifiedOrder>,
    object_sequence_numbers: BTreeMap<ObjectID, SequenceNumber>,
    object_refs: BTreeMap<ObjectID, ObjectRef>,
    object_certs: BTreeMap<ObjectID, Vec<TransactionDigest>>,
}

and

struct ClientState {
  addresses: BTreeMap<FastPayAddress, AddressState>,
  // ... committee info, which is the same for all addresses
}
```

/// Our FastPay address.
address: FastPayAddress,
/// Our signature key.
secret: KeyPair,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client state should know the public key for each of its addresses, but not the private key.

@gdanezis
Copy link
Collaborator

Related to this: Could we add detailed doc comment to each field of AuthorityStore?

@lxfind just added a quick PR with docs there:
#299

@oxade oxade closed this Feb 1, 2022
@huitseeker huitseeker deleted the rfc/client-state-persistence branch April 22, 2022 17:01
mwtian pushed a commit that referenced this pull request Sep 12, 2022
THis aligns the behavior of our Dag with that of the store

- adds the genesis by default to the Dag as part of the constructor,
- unit tests that initial provisioning,
- adapts the behavior of tests to not feed genesis manually.

Fixes #270
mwtian pushed a commit to mwtian/sui that referenced this pull request Sep 29, 2022
THis aligns the behavior of our Dag with that of the store

- adds the genesis by default to the Dag as part of the constructor,
- unit tests that initial provisioning,
- adapts the behavior of tests to not feed genesis manually.

Fixes MystenLabs#270
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants