
Support Batch Transaction #890

Merged
merged 1 commit into from
Mar 19, 2022

Conversation

@lxfind (Contributor) commented Mar 17, 2022

This PR aims to solve #876 by supporting batch transactions, in order to improve throughput.
To review this change, please start from message.rs, which introduces a few major changes:

  1. Previously, a TransactionKind could be Transfer, Call, or Publish. Now a TransactionKind is either Single or Batch, where Single wraps a SingleTransactionKind enum and Batch holds a vector of SingleTransactionKind. SingleTransactionKind contains the 3 variants we are familiar with: Transfer, Call, and Publish.
  2. Many of the functions previously implemented on TransactionKind are now implemented on SingleTransactionKind, with corresponding implementations on TransactionKind that handle the Batch case.
  3. In order to merge the input objects from every single transaction into one vector, we need to know exactly which objects belong to which single transaction. A Publish transaction does not have a deterministic input object count, hence we do not allow Publish to appear in a batch transaction. This allows us to walk through the input object list for each single transaction.
  4. When executing the transactions in a batch, we keep using the same temporary store and the same TxContext. If any of them returns ExecutionStatus::Failure, we roll back the entire transaction, charge gas and increment the gas object's version only, and return Failure. If all single transactions succeed, we commit the actual changes to the store.
  5. Added a dedicated unit test file.
  6. One nice optimization when initializing TemporaryAuthorityStore: we no longer need to recompute the object reference for each object, since the input kind already carries it for all owned objects.
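The enum restructuring described in points 1-3 can be sketched roughly as follows. This is a minimal illustration based on the description above, not the actual Sui code: the variant payloads (here just `u64` placeholders) and the helper names `single_transactions`/`is_valid` are assumptions for the sketch.

```rust
// Sketch of the TransactionKind split described above. Variant payloads are
// placeholder u64s standing in for the real argument structs.
#[derive(Debug, Clone, PartialEq)]
pub enum SingleTransactionKind {
    Transfer(u64),
    Call(u64),
    Publish(u64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum TransactionKind {
    Single(SingleTransactionKind),
    Batch(Vec<SingleTransactionKind>),
}

impl TransactionKind {
    /// Iterate over the constituent single transactions, uniformly for both
    /// the Single and the Batch case.
    pub fn single_transactions(&self) -> impl Iterator<Item = &SingleTransactionKind> {
        match self {
            TransactionKind::Single(s) => std::slice::from_ref(s).iter(),
            TransactionKind::Batch(v) => v.iter(),
        }
    }

    /// Batches may not contain Publish, because Publish has no statically
    /// known input object count (point 3 above).
    pub fn is_valid(&self) -> bool {
        match self {
            TransactionKind::Single(_) => true,
            TransactionKind::Batch(v) => {
                !v.iter().any(|s| matches!(s, SingleTransactionKind::Publish(_)))
            }
        }
    }
}
```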

transaction: &Transaction,
) -> SuiResult<Vec<(InputObjectKind, Object)>> {
let mut inputs = vec![];
for tx in transaction.single_transactions() {
Collaborator:
Hm, I know you said draft draft, but could not help. I think that the fact this is a transaction containing many transactions should be largely invisible to the authority code, besides the part that does execution. I think here we are going too deep into the semantics of the transaction. This should be hidden behind input_objects(), no?

Contributor Author:

Not really. check_locks is written for a single transaction, and it would be very hacky to make it work on all input objects from a batch transaction. For example, it compares the gas object and the transfer object as special cases. It also relies on the authenticator objects being in the same (single) transaction (I think even if we have a batch transaction, each single transaction should still authenticate independently). For this reason, we need to iterate through the single transactions and check their locks individually.

Collaborator (@gdanezis) commented Mar 17, 2022:

> I think even if we have a batch transaction, each single transaction should still authenticate independently

Ok, now this worries me. A batch transaction is one transaction that has many execution components. It should be authenticated by one person, contain one signature, and use one gas object. It should have one vec of input objects and one vec of shared objects, and result in one effects structure with the effects of all the executions amalgamated. It is charged gas as a whole. Am I missing something?

Contributor Author:

Most of these are correct, with one caveat. In the current definition of Transaction, we do not yet have the concept of authenticating objects (we should, though, and if that's a priority I can look into it first). So at the moment, any object owned by another object is authenticated by the presence of that other object among the arguments.
For example, say we have a batch of two Move Call transactions (and child object is owned by parent object):
Tx1: foo(child)
Tx2: bar(parent)

With the current structure of transactions, it would be strange if the above Batch authenticated successfully, i.e. if we somehow used an object from Tx2 to authenticate an object from Tx1.
So I want to make sure the above Batch fails to authenticate.

If we want to fix this, then we need a dedicated field in a Tx to include all authenticating objects that may not show up in arguments.

Contributor Author:

An alternative: I could pretend we have this today, and simply mix all the objects from the single transactions together for authentication, expecting that we will add that field at some point anyway.

Collaborator:

> It should have one vec of input objects, and one vec of shared objects, and will result in one effects with all the effects of all the executions amalgamated.

I think this is ok/preferred. The argument I used to convince myself is that if you have a batch of i transactions with input objects o_1, ..., o_N, this is semantically equivalent to publishing + calling a new, single entrypoint:

main(o_1, ..., o_N, ctx: &mut TxContext) {
  f_1(o_1, ...);
  ...
  f_i(..., o_N);
}

Note that here, f_1 could be Xun's foo(child) and f_2 could be Xun's bar(parent), so I don't think batches with a vec of amalgamated input objects gives you any more power. That is, I think we should allow foo(child); bar(parent) to be authenticated.

Collaborator:

> If we want to fix this, then we need a dedicated field in a Tx to include all authenticating objects that may not show up in arguments.

I do think we should eventually do this (as discussed somewhere else that I can't find to link), but unless I misunderstand, it is orthogonal to batches. I do, however, see how mixing this feature with batches could cause confusion about how to match the big vec of input objects against the arguments of each entrypoint (though I'm sure we can figure it out).

@lxfind lxfind force-pushed the batch-tx branch 6 times, most recently from 0b28dbb to 4f85524 on March 17, 2022 20:21
// TODO: For now, we just use the first gas object as the gas object in the returned effects.
// We should return a list of updated gas objects.
let gas_object_id = single_transactions[0].gas_payment_object_ref().0;
let mut responses: futures::stream::FuturesUnordered<_> = (0..single_transactions.len()
Collaborator:

My advice: write the most straightforward loop here, and optimize for clarity at this point. We will work out how to do this execution in an optimised way down the line; it will involve a single multi_get from the DB, etc. This model will try to do many parallel reads from the DB, potentially lowering perf.

Contributor:

What happens if the outputs of one transaction are needed by the next? Then they can't execute in parallel.

Collaborator:

Agreed with not doing parallel execution here. I don't think it would produce the same result as sequential execution if/when we have batches with shared objects (which is maybe Evan's point).

Contributor Author:

At the moment, there is a constraint that a mutable object (shared or not) cannot appear more than once among the single transactions in the same batch, which means parallel execution is ok.
For owned mutable objects, it's clear we don't want any object to show up in more than one single transaction, because there is no way (or at least it is very difficult) to pre-specify their object refs after the first appearance.
So I think you are suggesting we want to allow shared mutable objects to show up more than once? Why do we want that?

Collaborator:

I am not suggesting this; I misunderstood the policy on shared mutable objects in batches.

Collaborator:

I like the simplicity of not initially allowing more than one invocation of a move tx per shared object within a batch.

However, down the line we can allow this: since the batch contains an ordering, if two transactions operate on the same object, the first one goes first, and the second one takes as input the output of the first. This happens at the execution layer. The system still sees the whole batch as a black box, where a shared object goes in, and the shared object with version +1 comes out.

Contributor:

I see, I didn't know there was a restriction that each mutable object can only appear once. However, I also see that the entire batch is supposed to all succeed or all fail (thus the rollbacks). I'm wondering: why the atomic/batch behaviour if they are independent transactions? Usually the use case for atomic logic is to ensure that transactions that have dependencies between themselves all go in. Are we thinking of independent transactions which are logically connected somehow? Just trying to understand the use case.

I also realized, with @gdanezis's scenario of transactions that depend on one another:

o_1 --> Tx1 --> o_2 --> Tx2 --> o_3

In the example above, Tx2 operates on the output of Tx1. However, we have no way of specifying that today, because Tx2 cannot know the ObjRef of the output of Tx1 in advance; I think this is what @lxfind means. We would need a different way of specifying the input. Something like this:

pub enum TransactionInput {
    TI_ObjRef(ObjectRef),
    TI_PrevTxOutput(TransactionDigest),
}

where you can specify that the input you want is the output of some prev tx.

Contributor Author:

Great question! It's atomic/batch behavior at the moment purely for implementation simplicity.
Making them independent would get really complicated: we would need to be able to merge their effects, track a sub-transaction id in each TxContext (so that we could still create unique object IDs), do a lot of restructuring of the code, etc.
I don't think there is any fundamental reason we cannot do this; I just didn't go that route for now.
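The all-or-nothing semantics discussed in this thread can be illustrated with a simplified sequential loop. This is a sketch only: `TemporaryStore`, `ExecutionStatus`, and `execute_batch` here are stand-ins I made up for illustration, not the PR's actual types or function.

```rust
// Illustrative all-or-nothing batch execution: run every single transaction
// against one scratch store; commit only if all of them succeed.
#[derive(Default, Clone, PartialEq, Debug)]
pub struct TemporaryStore {
    pub writes: Vec<(u64, String)>, // (object id, new contents), simplified
}

pub enum ExecutionStatus {
    Success,
    Failure,
}

pub fn execute_batch(
    txs: &[&dyn Fn(&mut TemporaryStore) -> ExecutionStatus],
    store: &mut TemporaryStore,
) -> ExecutionStatus {
    let mut scratch = store.clone();
    for tx in txs {
        if let ExecutionStatus::Failure = tx(&mut scratch) {
            // Roll back by discarding the scratch store. (Per the PR
            // description, the real code still charges gas and bumps the gas
            // object's version on failure.)
            return ExecutionStatus::Failure;
        }
    }
    *store = scratch; // every single transaction succeeded: commit
    ExecutionStatus::Success
}
```

The design choice this mirrors: since all single transactions share one temporary store and one TxContext, a single failure anywhere discards all accumulated writes.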


@gdanezis (Collaborator):
I did not mean to approve but only comment :).

@velvia (Contributor) left a comment:

Wow, this is a big change. I'm wondering, what is the overall motivation: is it to make processing a whole bunch of transactions more efficient?
Need to think about the implications on the storage side.


@lxfind lxfind changed the base branch from main to check-transfer-gas-upfront March 18, 2022 06:43
@lxfind lxfind requested a review from gdanezis March 18, 2022 07:05
@lxfind (Contributor Author) commented Mar 18, 2022:

@gdanezis @sblackshear I have rewritten this PR based on our discussion, and updated the PR description.
Note that this is still a draft in that I haven't polished it and tests are failing. But the major changes should all be there now.

@gdanezis (Collaborator) left a comment:

Yes, this is what we need. Eventually we will go in and optimise a lot of the details, but right now the priority is to have the semantics there + a lot of tests to ensure we do not break things when we optimize.

Can we have a few tests at least to exercise both happy and unhappy cases? I rely on these heavily to ensure no errors are introduced.

match tx {
SingleTransactionKind::Transfer(_) => {
// Index access safe because the inputs were constructed in order.
let transfer_object = &inputs[0].1;
Collaborator:

Now this is the only part of this function that requires an input object besides the gas object. It would be a major win if we could get rid of the requirement of having all objects in order to check the transfer requirement. If we can do this, then we could execute this check before we check signatures.

This would allow us to shore up our DoS defences: we would only need to read the gas object balance before doing anything expensive, such as checking signatures. Maybe this is for a separate issue and PR.

.collect();

- let mut transaction_dependencies: BTreeSet<_> = inputs
+ let mut transaction_dependencies: BTreeSet<_> = objects_by_kind
Collaborator:

Own note: can we live here without a BTree?

Contributor Author:

Probably not; transaction_dependencies will eventually be stored as a Vec in the signed effects, so it needs deterministic ordering.
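A quick illustration of the determinism point: iterating a BTreeSet always yields sorted order, so the dependency list each authority serializes comes out identical regardless of the order in which dependencies were discovered. (The `u32` digests and the helper name are illustrative; the real code uses TransactionDigest.)

```rust
use std::collections::BTreeSet;

// Sketch: collect dependencies into a BTreeSet, then flatten to a Vec for
// the signed effects. BTreeSet iteration is sorted, hence deterministic;
// a HashSet would give a different, nondeterministic order per run.
pub fn ordered_dependencies(digests: &[u32]) -> Vec<u32> {
    let set: BTreeSet<u32> = digests.iter().copied().collect(); // dedups too
    set.into_iter().collect() // always ascending, whatever the input order
}
```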

return Ok(ExecutionStatus::new_failure(total_gas, *error));
}
ExecutionStatus::Success { gas_used, results } => {
last_results = results;
Collaborator:

I guess this is something like inner_results? A vector of results for each of the parts?

let mut last_results = vec![];
// TODO: Since we require all mutable objects do not show up more than
// once across single tx, we should be able to run them in parallel.
for (single_tx, inputs) in transaction
Collaborator:

Do we need transaction after that? Or can we own it, deconstruct it, and then avoid the clones down the line?

Contributor Author:

We need to store the entire CertifiedTransaction when we update the db, so yes we need the Transaction in there.

@lxfind (Contributor Author) commented Mar 18, 2022:

@gdanezis Thanks for the review! Some of your questions should be answered in the PR's updated description; please have a look too.

@lxfind lxfind force-pushed the check-transfer-gas-upfront branch 3 times, most recently from c24142d to 2572424 on March 18, 2022 17:32
Base automatically changed from check-transfer-gas-upfront to main March 18, 2022 17:54
@gdanezis (Collaborator):

> We cannot statically know the number of objects from each single transaction due to Publish kind

Can we live without supporting this in the batch type?

@lxfind (Contributor Author) commented Mar 18, 2022:

> > We cannot statically know the number of objects from each single transaction due to Publish kind
>
> Can we live without supporting this in the batch type?

Maybe? It's probably rare that people need to do batch publishing.

@lxfind lxfind changed the title [Draft] Support Batch Transaction Support Batch Transaction Mar 18, 2022
@lxfind lxfind marked this pull request as ready for review March 18, 2022 22:20
@lxfind lxfind force-pushed the batch-tx branch 3 times, most recently from aa322cb to 7d094bf on March 19, 2022 01:34
@lxfind lxfind changed the base branch from main to fix-bench-invariant March 19, 2022 01:35
@lxfind lxfind force-pushed the fix-bench-invariant branch from de1b281 to ea8278b on March 19, 2022 04:09
@lxfind lxfind changed the base branch from fix-bench-invariant to main March 19, 2022 18:27
@lxfind lxfind merged commit f7f1a30 into main Mar 19, 2022
@lxfind lxfind deleted the batch-tx branch March 19, 2022 19:03
mwtian pushed a commit that referenced this pull request Sep 12, 2022
* chore(ci): add script to update fastcrypto

* chore(ci): add auto-update job for fastcrypto

* chore(ci): update mysten-infra too

* fix: only update mysten-infra

Fastcrypto to be released through crates.io & updated through dependabot
mwtian pushed a commit to mwtian/sui that referenced this pull request Sep 29, 2022