This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Commit 550d64c

tomusdrw, tomaka and gilescope
authored
Transaction Pool docs (#9056)
* Add transaction pool docs.
* Extra docs.
* Apply suggestions from code review
* Expand on some review comments.
* Update README.md: fixed typos / spellings

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Squirrel <gilescope@gmail.com>
1 parent 7170fda commit 550d64c

File tree

2 files changed, +367 -1 lines changed


.editorconfig

+5
```diff
@@ -9,6 +9,11 @@ trim_trailing_whitespace=true
 max_line_length=100
 insert_final_newline=true
 
+[*.md]
+max_line_length=80
+indent_style=space
+indent_size=2
+
 [*.yml]
 indent_style=space
 indent_size=2
```

client/transaction-pool/README.md

+362 -1

Substrate transaction pool implementation.

License: GPL-3.0-or-later WITH Classpath-exception-2.0

# Problem Statement

The transaction pool is responsible for maintaining a set of transactions that
block authors can include in upcoming blocks. Transactions are received either
from the network (gossiped by other peers) or via RPC (submitted locally).

The main task of the pool is to prepare an ordered list of transactions for the
block authorship module. The same list is useful for gossiping to other peers,
but note that it's not a hard requirement for the gossiped transactions to be
exactly the same (see implementation notes below).

It is in the block author's interest to have the transactions stored and ordered
in such a way as to:

1. Maximize the block author's profits (value of the produced block)
2. Minimize the block author's amount of work (time to produce a block)

In the case of FRAME the first property is simply making sure that the fee per
weight unit is the highest (high `tip` values); the second is about avoiding
feeding in transactions that cannot be part of the next block (they are invalid,
obsolete, etc).

From the transaction pool's point of view, transactions are simply opaque blobs
of bytes; it's required to query the runtime (via the `TaggedTransactionQueue`
runtime API) to verify a transaction's basic correctness and to extract any
information about how the transaction relates to other transactions in the pool
and to the current on-chain state. Only valid transactions should be stored in
the pool.

Each imported block can affect the validity of transactions already in the pool.
Block authors expect the pool to provide the most up-to-date information about
transactions that can be included in the block they are going to build on top of
the just-imported one. The process of ensuring this property is called
*pruning*. During pruning the pool should remove transactions which are
considered invalid by the runtime (queried at the current best imported block).

Since the blockchain is not always linear, forks need to be correctly handled by
the transaction pool as well. In case of a fork, some blocks are *retracted*
from the canonical chain, and some other blocks get *enacted* on top of a common
ancestor. The transactions from retracted blocks could simply be discarded, but
it's desirable to make sure they are still considered for inclusion in case they
are deemed valid by the runtime state at the best, recently enacted block (the
fork the chain re-organized to).

The transaction pool should also offer a way of tracking a transaction's
lifecycle in the pool: its broadcast status, block inclusion, finality, etc.

## Transaction Validity details

Information retrieved from the runtime is encapsulated in the
`TransactionValidity` type.

```rust
pub type TransactionValidity = Result<ValidTransaction, TransactionValidityError>;

pub struct ValidTransaction {
    pub requires: Vec<TransactionTag>,
    pub provides: Vec<TransactionTag>,
    pub priority: TransactionPriority,
    pub longevity: TransactionLongevity,
    pub propagate: bool,
}

pub enum TransactionValidityError {
    Invalid(/* details */),
    Unknown(/* details */),
}
```

We will now go through each of the parameters to understand the requirements
they create for transaction ordering.

The runtime is expected to return these values in a deterministic fashion.
Calling the API multiple times given exactly the same state must return the same
results. Field-specific rules are described below.

### `requires` / `provides`

These two fields contain a set of `TransactionTag`s (opaque blobs) associated
with a given transaction. Looking at these fields we can find dependencies
between transactions and their readiness for block inclusion.

The `provides` set contains properties that will be *satisfied* in case the
transaction is successfully added to a block. The `requires` set contains
properties that must be satisfied **before** the transaction can be included in
a block.

Note that a transaction with an empty `requires` set can be added to a block
immediately; there are no other transactions that it expects to be included
before it.

For a given series of transactions, the `provides` and `requires` fields will
create a (simple) directed acyclic graph. The *sources* in such a graph, if they
don't have any extra `requires` tags (i.e. they have all their dependencies
*satisfied*), should be considered for block inclusion first. Multiple
transactions that are ready for block inclusion should be ordered by `priority`
(see below).

Note that the process of including transactions in a block is basically building
the graph, then repeatedly selecting "the best" source vertex (transaction) with
all tags satisfied and removing it from that graph.

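The selection process described above can be sketched as follows. This is a
minimal model with made-up types (`Tx`, string tags); the real pool works with
opaque byte tags and dedicated data structures:

```rust
use std::collections::HashSet;

// Hypothetical, simplified types; real tags are opaque byte blobs.
type Tag = &'static str;

struct Tx {
    name: &'static str,
    requires: Vec<Tag>,
    provides: Vec<Tag>,
    priority: u64,
}

// Repeatedly pick the highest-priority transaction whose `requires` tags are
// all satisfied, then mark its `provides` tags as satisfied, until no
// transaction is ready. Whatever remains is "future" (unreachable) material.
fn order_for_inclusion(mut pool: Vec<Tx>) -> Vec<&'static str> {
    let mut satisfied: HashSet<Tag> = HashSet::new();
    let mut ordered = Vec::new();
    loop {
        let best = pool
            .iter()
            .enumerate()
            .filter(|(_, tx)| tx.requires.iter().all(|t| satisfied.contains(t)))
            .max_by_key(|(_, tx)| tx.priority)
            .map(|(i, _)| i);
        match best {
            Some(i) => {
                let tx = pool.swap_remove(i);
                satisfied.extend(tx.provides.iter().copied());
                ordered.push(tx.name);
            }
            None => break,
        }
    }
    ordered
}

fn main() {
    let pool = vec![
        Tx { name: "b", requires: vec!["a1"], provides: vec!["b1"], priority: 10 },
        Tx { name: "a", requires: vec![], provides: vec!["a1"], priority: 1 },
        Tx { name: "c", requires: vec![], provides: vec!["c1"], priority: 5 },
    ];
    // "c" and "a" are ready; "c" has higher priority, and including "a"
    // satisfies tag "a1", which unlocks "b".
    assert_eq!(order_for_inclusion(pool), vec!["c", "a", "b"]);
}
```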
#### Examples

- A transaction in a Bitcoin-like chain will `provide` the UTXOs it generates
  and will `require` the UTXOs it is still awaiting (note that this is not
  necessarily all of its inputs, since some of them might already be spendable,
  i.e. the UTXO is in state).

- A transaction in an account-based chain will `provide`
  `(sender, transaction_index/nonce)` (as one tag), and will `require`
  `(sender, nonce - 1)` in case `on_chain_nonce < nonce - 1`.

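The account/nonce example can be illustrated with a small sketch. The
`nonce_tags` helper and the `"{sender}-{nonce}"` string encoding are
hypothetical; real runtimes encode tags as opaque byte blobs:

```rust
// Compute (requires, provides) for an account-based transaction, following
// the rule above: require the predecessor tag only when
// `on_chain_nonce < nonce - 1` (written without underflow).
fn nonce_tags(sender: &str, nonce: u64, on_chain_nonce: u64) -> (Vec<String>, Vec<String>) {
    let provides = vec![format!("{}-{}", sender, nonce)];
    let requires = if on_chain_nonce + 1 < nonce {
        vec![format!("{}-{}", sender, nonce - 1)]
    } else {
        vec![]
    };
    (requires, provides)
}

fn main() {
    // A gap in nonces creates a dependency on the missing predecessor.
    assert_eq!(
        nonce_tags("alice", 5, 2),
        (vec!["alice-4".to_string()], vec!["alice-5".to_string()])
    );
    // The next expected nonce is immediately ready.
    assert_eq!(nonce_tags("alice", 3, 2), (vec![], vec!["alice-3".to_string()]));
}
```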
#### Rules & caveats

- `provides` must not be empty
- transactions with an overlap in `provides` tags are mutually exclusive
- checking the validity of a transaction that `requires` tag `A` after
  including a transaction that provides that tag must not return `A` in
  `requires` again
- runtime developers should avoid re-using a `provides` tag (i.e. it should be
  unique)
- there should be no cycles in transaction dependencies
- caveat: on-chain state conditions may render a transaction invalid despite no
  `requires` tags
- caveat: on-chain state conditions may render a transaction valid despite some
  `requires` tags
- caveat: including transactions in a chain might make them valid again right
  away (for instance a UTXO transaction gets in, but since we don't store spent
  outputs it will be valid again, awaiting the same inputs/tags to be satisfied)

### `priority`

Transaction priority describes the importance of the transaction relative to
other transactions in the pool. Block authors can expect to benefit from
including such transactions before others.

Note that we can't simply order transactions in the pool by `priority`, because
first we need to make sure that all of the transaction's requirements are
satisfied (see the `requires`/`provides` section). However, if we consider a set
of transactions which all have their requirements (tags) satisfied, the block
author should choose the ones with the highest priority to include in the next
block first.

`priority` can be any number between `0` (lowest inclusion priority) and
`u64::MAX` (highest inclusion priority).

#### Rules & caveats

- `priority` of a transaction may change over time
- on-chain conditions may affect `priority`
- Given two transactions with overlapping `provides` tags, the one with higher
  `priority` should be preferred. However, we can also look at the total
  priority of the subtree rooted at that transaction and compare that instead
  (i.e. even though the transaction itself has lower `priority`, it "unlocks"
  other high-priority transactions).

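The subtree comparison in the last point can be sketched as follows. The
`should_replace` helper is hypothetical; the real pool determines the subtree
from the `provides`/`requires` graph rather than receiving it as a list:

```rust
// Decide whether an incoming transaction should replace an existing one with
// overlapping `provides` tags: compare the incoming priority against the
// cumulative priority of the whole subtree the existing transaction unlocks.
fn should_replace(incoming_priority: u64, existing_subtree_priorities: &[u64]) -> bool {
    let subtree_total: u64 = existing_subtree_priorities.iter().sum();
    incoming_priority > subtree_total
}

fn main() {
    // The existing transaction has priority 10 but unlocks children worth
    // 20 + 30, so an incoming priority of 50 is not enough to displace it.
    assert!(!should_replace(50, &[10, 20, 30]));
    assert!(should_replace(100, &[10, 20, 30]));
}
```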
### `longevity`

Longevity describes how long (in blocks) the transaction is expected to be
valid. This parameter only gives a hint to the transaction pool about how long
the current transaction may still be valid. Note that it does not guarantee the
transaction is valid all that time, though.

#### Rules & caveats

- `longevity` of a transaction may change over time
- on-chain conditions may affect `longevity`
- after `longevity` lapses the transaction may still be valid

### `propagate`

This parameter instructs the pool to propagate/gossip a transaction to node
peers. By default this should be `true`, however in some cases it might be
undesirable to propagate transactions further. Examples might include heavy
transactions produced by block authors in offchain workers (DoS), or the risk
of being front-run by someone else after finding some non-trivial solution or
equivocation, etc.

### `TransactionSource`

To make it possible for the runtime to distinguish whether the transaction that
is being validated was received over the network, submitted using local RPC, or
is simply part of a block that is being imported, the transaction pool should
pass an additional `TransactionSource` parameter to the validity function
runtime call.

This can be used by runtime developers to quickly reject transactions that, for
instance, are not expected to be gossiped in the network.

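As a sketch, the source variants look roughly like this (mirroring
`sp_runtime::transaction_validity::TransactionSource`); the `accepts_unsigned`
policy is a made-up example of source-based filtering:

```rust
// Where the transaction being validated came from.
pub enum TransactionSource {
    /// Part of a block that is being imported.
    InBlock,
    /// Submitted locally, e.g. over RPC.
    Local,
    /// Received over the network from a peer.
    External,
}

// Example policy: a runtime might only accept some (hypothetical) unsigned
// extrinsic when it was produced locally or is already part of a block,
// rejecting gossiped copies outright.
fn accepts_unsigned(source: &TransactionSource) -> bool {
    !matches!(source, TransactionSource::External)
}

fn main() {
    assert!(accepts_unsigned(&TransactionSource::Local));
    assert!(accepts_unsigned(&TransactionSource::InBlock));
    assert!(!accepts_unsigned(&TransactionSource::External));
}
```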
### `Invalid` transaction

In case the runtime returns an `Invalid` error it means the transaction cannot
be added to a block at all. Extracting the actual reason for invalidity gives
more details about the source. For instance a `Stale` transaction just
indicates the transaction was already included in a block, while `BadProof`
signifies an invalid signature.
Invalidity might also be temporary. In case of `ExhaustsResources` the
transaction does not fit in the current block, but it might be okay for the
next one.

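A minimal sketch of distinguishing temporary from permanent invalidity; the
variant names mirror a subset of Substrate's `InvalidTransaction` enum, while
`is_temporarily_invalid` is a hypothetical helper:

```rust
// A subset of invalidity reasons, as described above.
#[derive(Debug)]
enum InvalidKind {
    /// Already included in a block (or otherwise obsolete).
    Stale,
    /// Invalid signature.
    BadProof,
    /// Does not fit into the current block; may fit the next one.
    ExhaustsResources,
}

/// `true` when the transaction may be retried for a later block.
fn is_temporarily_invalid(err: &InvalidKind) -> bool {
    matches!(err, InvalidKind::ExhaustsResources)
}

fn main() {
    assert!(is_temporarily_invalid(&InvalidKind::ExhaustsResources));
    assert!(!is_temporarily_invalid(&InvalidKind::Stale));
    assert!(!is_temporarily_invalid(&InvalidKind::BadProof));
}
```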
### `Unknown` transaction

In case of `Unknown` validity, the runtime cannot determine if the transaction
is valid or not in the current block. However, this situation might be
temporary, so the transaction is expected to be retried in the future.

# Implementation

An ideal transaction pool should store only transactions that are considered
valid by the runtime at the current best imported block.
After every block is imported, the pool should:

1. Revalidate all transactions in the pool and remove the invalid ones.
1. Construct the transaction inclusion graph based on `provides`/`requires`
   tags. Some transactions might not be reachable (have unsatisfied
   dependencies); they should just be left out in the pool.
1. On block author request, the graph should be copied and transactions should
   be removed one-by-one from the graph, starting from the one with the highest
   priority and all conditions satisfied.

With the current gossip protocol, networking should propagate transactions in
the same order as a block author would include them. Most likely it's fine if
we propagate transactions with cumulative weight not exceeding the upcoming `N`
blocks (choosing `N` is subject to networking conditions and block times).

Note that it's not a strict requirement, though, to propagate exactly the same
transactions that are prepared for block inclusion. Propagation is best effort,
especially for block authors, and is not directly incentivised. However, the
networking protocol might penalise peers that send invalid or useless
transactions, so we should be nice to others. Also see below for a proposal to
have other peers request the transactions they are interested in, instead of
gossiping everything.

The pool is expected to store more transactions than can fit in a single block,
so validating the entire pool on every block might not be feasible, and the
actual implementation might need to take some shortcuts.

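Step 1 above can be sketched as a simple filter. The types are deliberately
simplified; `is_valid` is a stand-in for the `TaggedTransactionQueue` runtime
call at the new best block:

```rust
// Revalidate every pooled transaction after a block import and keep only the
// ones the runtime still considers valid.
fn revalidate_pool<F>(pool: Vec<&'static str>, is_valid: F) -> Vec<&'static str>
where
    F: Fn(&str) -> bool,
{
    pool.into_iter().filter(|tx| is_valid(tx)).collect()
}

fn main() {
    // Pretend "b" became stale after the import (e.g. it was included in the
    // new block), so revalidation drops it.
    let kept = revalidate_pool(vec!["a", "b", "c"], |tx| tx != "b");
    assert_eq!(kept, vec!["a", "c"]);
}
```

As the surrounding text notes, doing this literally for the whole pool on every
import may be too expensive, which motivates the shortcut strategies below.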
## Suggestions & caveats

1. The validity of a transaction should not change significantly from block to
   block. I.e. changes in validity should happen predictably, e.g. `longevity`
   decrements by 1, `priority` stays the same, `requires` changes if a
   transaction that provided a tag was included in a block, `provides` does not
   change, etc.

1. That means we don't have to revalidate every transaction after every block
   import, but we need to take care of removing potentially stale transactions.

1. Transactions with exactly the same bytes are most likely going to give the
   same validity results. We can essentially treat them as identical.

1. Watch out for re-organisations and re-importing transactions from retracted
   blocks.

1. In the past there were many issues found when running small networks with a
   lot of re-orgs. Make sure that transactions are never lost.

1. The UTXO model is quite challenging. A transaction becomes valid right after
   it's included in a block, however it is waiting for exactly the same inputs
   to be spent, so it will never really be included again.

1. Note that in a non-ideal implementation the state of the pool will most
   likely always be a bit off, i.e. some transactions might still be in the
   pool but be invalid. The hard decision is about the trade-offs you take.

1. Note that import notifications are not reliable: you might not receive a
   notification about every imported block.

## Potential implementation ideas

1. Block authors remove transactions from the pool when they author a block.
   We still keep them around to re-import in case the block does not end up
   canonical. This only works if the node is actively authoring blocks (also
   see below).

1. We don't prune, but rather remove a fixed amount of transactions from the
   front of the pool (the number based on the average/max transactions per
   block from the past) and re-validate them, re-importing the ones that are
   still valid.

1. We periodically validate all transactions in the pool in batches.

1. To minimize runtime calls, we introduce a batch-verify call. Note it should
   reset the state (overlay) after every verification.

1. Consider leveraging finality. Maybe we could verify against the latest
   finalised block instead. With this, the pool in different nodes can be more
   similar, which might help with gossiping (see set reconciliation). Note
   that finality is not a strict requirement for a Substrate chain to have,
   though.

1. Perhaps we could avoid maintaining ready/future queues as we do currently,
   and rather, if a transaction doesn't have all requirements satisfied by
   existing transactions, we attempt to re-import it in the future.

1. Instead of maintaining a full pool with a total ordering, we attempt to
   maintain a set of the next (couple of) blocks. We could introduce a
   batch-validate runtime API method that pretty much attempts to simulate
   actual block inclusion of a set of such transactions (without necessarily
   fully running/dispatching them). Importing a transaction would consist of
   figuring out which upcoming block this transaction has a chance to be
   included in and then attempting to either push it back or replace some of
   the existing transactions.

1. Perhaps we could use some immutable graph structure to easily add/remove
   transactions. We need some traversal method that takes priority and
   reachability into account.

1. It was discussed in the past to use set reconciliation strategies instead
   of simply broadcasting all/some transactions to all/selected peers.
   Ethereum's [EIP-2464](https://github.com/ethereum/EIPs/blob/5b9685bb9c7ba0f5f921e4d3f23504f7ef08d5b1/EIPS/eip-2464.md)
   might be a good first approach to reduce transaction gossip.

# Current implementation

The current implementation of the pool is a result of experience with
Ethereum's pool implementation, but it also has some warts coming from the
learning process of Substrate's generic nature and light client support.

The pool consists of basically two independent parts:

1. The transaction pool itself.
2. A maintenance background task.

The pool is split into a `ready` pool and a `future` pool. The latter contains
transactions that don't have their requirements satisfied, and the former holds
transactions that can be used to build a graph of dependencies. Note that the
graph is built ad hoc during the traversal process (getting the `ready`
iterator). This makes the import process cheaper (we don't need to find the
exact position in the queue or graph), but the traversal process slower
(logarithmic). However, most of the time we will only need the beginning of the
total ordering of transactions for block inclusion or network propagation,
hence the decision.

The maintenance task is responsible for:

1. Periodically revalidating the pool's transactions (revalidation queue).
1. Handling block import notifications and doing pruning + re-importing of
   transactions from retracted blocks.
1. Handling finality notifications and relaying those to transaction-specific
   listeners.

Additionally we maintain a list of recently included/rejected transactions
(`PoolRotator`) to quickly reject transactions that are unlikely to be valid,
in order to limit the number of runtime verification calls.

Each time a transaction is imported, we first verify its validity and then find
out whether the tags it `requires` can be satisfied by transactions already in
the `ready` pool. In case the transaction is imported into the `ready` pool we
additionally *promote* transactions from the `future` pool if the transaction
happened to fulfill their requirements.
Note that we need to cater for cases where a transaction might replace an
already existing transaction in the pool. In such a case we check the entire
sub-tree of transactions that we are about to replace and compare their
cumulative priority to determine which subtree to keep.

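The *promotion* step can be sketched as follows (hypothetical, simplified
types; the real pool tracks satisfied tags incrementally rather than taking
them as a set argument):

```rust
use std::collections::HashSet;

// Split the `future` pool: transactions whose `requires` tags are now all
// satisfied become ready; the rest stay in `future`.
fn promote(
    future: Vec<(&'static str, Vec<&'static str>)>, // (name, requires)
    satisfied: &HashSet<&str>,
) -> (Vec<&'static str>, Vec<(&'static str, Vec<&'static str>)>) {
    let (ready, still_future): (Vec<_>, Vec<_>) = future
        .into_iter()
        .partition(|(_, requires)| requires.iter().all(|t| satisfied.contains(t)));
    (ready.into_iter().map(|(name, _)| name).collect(), still_future)
}

fn main() {
    let satisfied: HashSet<&str> = ["a1", "a2"].into_iter().collect();
    let future = vec![
        ("x", vec!["a1"]),       // all requirements now satisfied
        ("y", vec!["a1", "b1"]), // still missing "b1"
    ];
    let (ready, still_future) = promote(future, &satisfied);
    assert_eq!(ready, vec!["x"]);
    assert_eq!(still_future.len(), 1);
}
```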
After a block is imported we kick off the pruning procedure. We first attempt
to figure out what tags were satisfied by the transactions in that block. For
each block transaction we either call into the runtime to get its
`ValidTransaction` object, or we check the pool to see if that transaction is
already known, to spare the runtime call. From this we gather the full set of
`provides` tags and perform pruning of the `ready` pool based on that. We also
promote all transactions from `future` that have their tags satisfied.

In case we remove a transaction that we are unsure was already included in the
current block or some block in the past, it is added to the revalidation queue
and the background task attempts to re-import it in the future.

Runtime calls to verify transactions are performed on a separate (limited)
thread pool to avoid interfering too much with other subsystems of the node.
We definitely don't want all cores validating network transactions, because
all of these transactions need to be considered untrusted (a potential DoS
vector).
