-
Notifications
You must be signed in to change notification settings - Fork 11.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Batch Transaction #890
Conversation
sui_core/src/authority.rs
Outdated
transaction: &Transaction, | ||
) -> SuiResult<Vec<(InputObjectKind, Object)>> { | ||
let mut inputs = vec![]; | ||
for tx in transaction.single_transactions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I know you said draft draft, but could not help. I think that the fact this is a transaction containing many transactions should be largely invisible to the authority code, besides the part that does execution. I think here we are going too deep into the semantics of the transaction. This should be hidden behind input_objects()
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. check_locks
is written for a single transaction, and it would be very hacky if we try to make it work for all input objects from a batch transaction. For example, it compares the gas object or the transfer object as special case. It also relies on the authenticator objects being in the same (single) transaction (I think even we have a batch transaction, each single transaction should still authenticate independently). For this reason, we need to iterate through all single transactions and check their locks individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think even we have a batch transaction, each single transaction should still authenticate independently
Ok, now this worries me. A batch transaction is one transaction that has many execution components. It should be authenticated by one person, contain one signature, and use one gas object. It should have one vec of input objects, and one vec of shared objects, and will result in one effects with all the effects of all the executions amalgamated. It is charged gas as a whole. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these are correct, with one caveat. In the current definition of Transaction, we do not yet have the concept of authenticating objects (we should though, and if that's a priority, I can look into that first). So any object owned by another object at the moment is authenticated by the existence of another object arguments.
For example, say we have a batch of two Move Call transactions (and child object is owned by parent object):
Tx1: foo(child)
Tx2: bar(parent)
With the current structure of transactions, it would be strange if the above Batch authenticates successfully, i.e. that we are somehow using an object from Tx2 to authenticate an object from Tx1.
So I want to make sure the above Batch fails to authenticate.
If we want to fix this, then we need a dedicated field in a Tx to include all authenticating objects that may not show up in arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative is, I could pretend we have this today, and simply mix up all objects from the single-transactions to authenticate, expecting that at some point we will add that field anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. It should have one vec of input objects, and one vec of shared objects, and will result in one effects with all the effects of all the executions amalgamated.
I think this is ok/preferred. The argument I used to convince myself is that if you have a batch of I
transactions with input objects o_1, ..., o_N
, this is semantically equivalent to publishing + calling a new, single entrypoint:
main(o1, ..., o_N, ctx: &mut TxContext) {
f_1(o_1, ...);
...
f_i(..., o_N);
}
Note that here, f_1
could be Xun's foo(child)
and f_2
could be Xun's bar(parent)
, so I don't think batches with a vec of amalgamated input objects gives you any more power. That is, I think we should allow foo(child); bar(parent)
to be authenticated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to fix this, then we need a dedicated field in a Tx to include all authenticating objects that may not show up in arguments.
I do think that we should eventually do this (as discussed somewhere else that I can't find to link), but unless I misunderstand, this is orthogonal to batches. I do, however, see how mixing this feature and batches could cause confusion about how to match the big vec of input objects against the arguments of each entrypoint (though I'm sure we can figure it out).
0b28dbb
to
4f85524
Compare
sui_core/src/authority.rs
Outdated
// TODO: For now, we just use the first gas object as the gas object in the returned effects. | ||
// We should return a list of updated gas objects. | ||
let gas_object_id = single_transactions[0].gas_payment_object_ref().0; | ||
let mut responses: futures::stream::FuturesUnordered<_> = (0..single_transactions.len() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My advice: write here the most straight forward loop, and optimize for clarity at this point. We will work out how to do this execution in an optimised way down the line. It will involve a single multi_get from the DB etc. This model will try to do many parallel reads into the DB potentially lowering perf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the outputs of one transaction is needed by the next, then they can't execute in parallel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with not doing parallel execution here. I don't think it will produce the same result as sequential execution if/when we have batches with shared objects (which is maybe Evan's point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, there is a constraint where a mutable object (shared or not) cannot appear more than once among the single transactions in the same batch, which means parallel execution is ok.
For owned mutable objects, it's clear we don't want any object to show up in more than one single transactions, because there is no way (or at least very difficult) to pre-specify their object refs after the first appearance.
So I think you are suggesting we want to allow shared mutable objects to show up more than once? Why do we want that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not suggesting this--I misunderstood the policy on shared mutable objects in batches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the simplicity of not initially allowing more than one invocation of a move tx per shared object within a batch.
However, down the line: we can allow that since the batch contains an ordering, than if two transactions operate on an object, the first one goes first, then the second one takes as input the output of the first one. This happens on the execution layer. The system later still sees the whole batch as a blackbox, where a shared object goes in, and then the shared object with version +1 goes out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I didn't know that there was a restriction of each mutable object only appearing once. However, I also see that the entire batch is supposed to all succeed or none succeed (thus rollbacks). I'm wondering.... why the atomic / batch behaviour if they are independent transactions? Usually the use case for atomic logic is to ensure that transactions that have dependencies between themselves all go in. Are we thinking independent transactions which are logically connected somehow? Just trying to understand the use case.
I also realized with @gdanezis 's scenario of transactions that depend on one another:
o_1 --> Tx1 --> o_2 --> Tx2 --> o_3
In the example above, Tx2 operates on the output of Tx1. However, we have no way of specifying that today, because Tx2 cannot know the ObjRef of the output of Tx1 in advance, I think this is what @lxfind means. We would need a different way of specifying the input. Something like this:
pub enum TransactionInput {
TI_ObjRef(ObjectRef),
TI_PrevTxOutput(TransactionDigest),
}
where you can specify that the input you want is the output of some prev tx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! It's atomic / batch behavior at the moment purely for implementation simplicity.
To make them parallel this can get really complicated: we will need to be able to merge their effects, we will need to track a sub-transaction id in each TxContext (so that we could still create unique object IDs), a lot of restructuring of the code and etc.
I don't think there is any fundamental reason that we cannot do this, just didn't go that route for now.
sui_core/src/authority.rs
Outdated
transaction: &Transaction, | ||
) -> SuiResult<Vec<(InputObjectKind, Object)>> { | ||
let mut inputs = vec![]; | ||
for tx in transaction.single_transactions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think even we have a batch transaction, each single transaction should still authenticate independently
Ok, now this worries me. A batch transaction is one transaction that has many execution components. It should be authenticated by one person, contain one signature, and use one gas object. It should have one vec of input objects, and one vec of shared objects, and will result in one effects with all the effects of all the executions amalgamated. It is charged gas as a whole. Am I missing something?
I did not mean to approve but only comment :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow this is a big change. I'm wondering, what is the overall motivation, is it to make processing a whole bunch of transactions more efficient?
Need to think about the implications from storage side.
sui_core/src/authority.rs
Outdated
// TODO: For now, we just use the first gas object as the gas object in the returned effects. | ||
// We should return a list of updated gas objects. | ||
let gas_object_id = single_transactions[0].gas_payment_object_ref().0; | ||
let mut responses: futures::stream::FuturesUnordered<_> = (0..single_transactions.len() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the outputs of one transaction is needed by the next, then they can't execute in parallel?
@gdanezis @sblackshear I have rewritten this PR in the way based on our discussion. Also updated the PR description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is what we need. Eventually we will go in and optimise a lot of the details, but right now the priority is to have the semantics there + a lot of tests to ensure we do not break things when we optimize.
Can we have a few tests at least to exercise both happy and unhappy cases? I rely on these heavily to ensure no errors are introduced.
sui_core/src/authority.rs
Outdated
match tx { | ||
SingleTransactionKind::Transfer(_) => { | ||
// Index access safe because the inputs were constructed in order. | ||
let trransfer_object = &inputs[0].1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this is the only part of this function that requires an input object besides the gas object. It would be a major win if we can get rid of the requirement to have all objects to check the transfer requirement. If we can go this, then we could execute this check before we check for signatures.
This would allow us to shore up out DoS defences: we only need to do a read on the gas object balance before we do anything expensive, such as checking signatures. Maybe this is for a separate issue and PR,
.collect(); | ||
|
||
let mut transaction_dependencies: BTreeSet<_> = inputs | ||
let mut transaction_dependencies: BTreeSet<_> = objects_by_kind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Own note: can we live here without a Btree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not, transaction_dependencies will be a Vec eventually stored in the signed effects. So it needs deterministic ordering.
return Ok(ExecutionStatus::new_failure(total_gas, *error)); | ||
} | ||
ExecutionStatus::Success { gas_used, results } => { | ||
last_results = results; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is something line inner_results? A vector of results for each of the parts?
sui_core/src/execution_engine.rs
Outdated
let mut last_results = vec![]; | ||
// TODO: Since we require all mutable objects do not show up more than | ||
// once across single tx, we should be able to run them in parallel. | ||
for (single_tx, inputs) in transaction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need transaction after that? Or can we own it, deconstruct it, and then avoid the clones down the line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to store the entire CertifiedTransaction when we update the db, so yes we need the Transaction in there.
@gdanezis Thanks for the review! Some of your questions should be answered in the PR's updated description, please have a look too. |
c24142d
to
2572424
Compare
Can we live without supporting this in the batch type? |
Maybe? It's probably rare if people need to do batch publishing. |
aa322cb
to
7d094bf
Compare
de1b281
to
ea8278b
Compare
* chore(ci): add script to update fastcrypto * chore(ci): add auto-update job for fastcrypto * chore(ci): update mysten-infra too * fix: only update mysten-infra Fastcrypto to be released through crates.io & updated through dependabot
* chore(ci): add script to update fastcrypto * chore(ci): add auto-update job for fastcrypto * chore(ci): update mysten-infra too * fix: only update mysten-infra Fastcrypto to be released through crates.io & updated through dependabot
This PR aims to solve #876, supporting batch transaction in order to improve throughput.
To review this change, please start from message.rs. It introduced a few major changes in message.rs: