RFC: structured concurrency #2596
Comments
To me, the most fundamental idea in structured concurrency is that functions should work as abstractions: you should be able to tell by looking at a call site whether a function can spawn work that outlives the call, without inspecting the body of the function. Having implicit scopes breaks this property. This design still gives stronger guarantees than the traditional approach where you spawn tasks with unbounded lifetimes all over the place. But it's still much weaker than if you removed implicit scopes entirely.

There's a lot here that reminds me of ZIO, so I'll link to this discussion of ZIO on the structured concurrency forum instead of repeating it :-) https://trio.discourse.group/t/zio-scala-library/72
@njsmith Thanks for commenting on this, Nathaniel! I read through the Zio thread (as well as your posts on structured concurrency 🙂), but I'm unsure how implicit scopes (in this proposal, at least) break the abstractions offered by structured concurrency. For instance, the implicit scope created by

    task::scope(async {
        let join_handle = tokio::spawn(async {
            // do work
        });
    }).await;

...shouldn't be able to outlive
It can't outlive

    fn looks_innocent() {
        tokio::spawn(...);
    }

    // then elsewhere...
    async fn some_other_fn() {
        looks_innocent();
        // is looks_innocent still running? how can I know?
    }

The basic unit of abstraction/encapsulation is a function call/stack frame. A task is a larger unit, composed of multiple stack frames. So if you have an implicit task-global scope, that can leak child tasks across different functions within the same parent task.
Thanks for the reply. I agree with you that having

To reuse your example, I would assume that you would want to do this:

    async fn looks_innocent() {
        task::scope(async {
            tokio::spawn(...);
        }).await;
    }

    // then elsewhere...
    async fn some_other_fn() {
        looks_innocent().await
    }

The problem is that async Rust is heavily geared towards using drop to cancel, but drop must happen immediately (i.e. not block and not be async). Imagine you do the following:

    tokio::select! {
        _ = looks_innocent() => { ... }
        _ = time::delay_for(ms(1)) => { ... }
    }
    // No way to guarantee all tasks spawned by `looks_innocent()` completes

if the delay elapses first, the async block running

@Matthias247 approached this problem by having

The panic on drop strategy leads to a separate set of problems. If code inside of a scope panics, the scope must be immediately torn down. Since we are panicking, the

tl;dr, I really wish there was a clean way to introduce
Can the tasks not be explicitly passed back from

Something like:

    let tasks = tokio::select! {
        _ = looks_innocent() => { ... }
        _ = time::delay_for(ms(1)) => { ... }
    }
    tasks.async_drop().await?

With each task implementing the

Of course, I've probably missed something fundamental in my assertions, in which case I apologise.
@hgomersall
@davidbarsky the point was less about some hypothetical language feature and more to posit how an explicit library call at the correct location might allow behaviour different to simply panicking. Edit: I was pondering the issue raised about needing to be compatible with the existing feature set.
@hgomersall
@carllerche sure, it wasn't meant to be a solution as such, more thinking around whether forcing the user to do things explicitly would be a valid way to complement the existing API. Can structured concurrency be added on to what exists with some slightly warty API additions...?
There currently is no obvious way forward. As @Matthias247 pointed out, one should be able to establish scopes at any point. However, there is no way to enforce the scope without blocking the thread. The best we can do towards enforcing the scope is to panic when used "incorrectly". This is the strategy @Matthias247 has taken in his PRs.

However, dropping ad hoc is currently a key async Rust pattern. I think this prohibits panicking when dropping a scope that isn't 100% complete. If we do this, using a scope within a

We are at an impasse. Maybe if AsyncDrop lands in Rust then we can investigate this again. Until then, we have no way forward, so I will close this. It is definitely an unfortunate outcome.
Hi, i am sorry to post here as this is probably just noise but this feature would be really useful and i hope that maybe a 'simple' case version can be developed to support a case such as this:

What you see here is a parent function which is given a list of files which it then parses and forms an iterator which filter_maps file contents in a Vec structure. This iterator is then cloned() to two independent async functions which read each vector and write these contents to databases. The problem with this is that the typical tokio spawn method requires self to be static, as well as the other arguments to the parent function. However this isn't needed at all because both futures are joined before the scope exits, and so it's not possible for these tasks to reference stale values as the scope lives longer than the futures! I admit i don't understand all the considerations of this feature that may be blocking a general solution, but a solution which fixes this simpler case would be really amazing as I cannot figure out how to express something which is, as the literature calls it, 'embarrassingly parallel'.
@stevemk14ebr Even with the changes that were proposed here, the
i was able to sort of solve this in a weird way using
Note that this requires zero moves or lifetime markings, it just works as i would expect.
That blocks the thread running the
oh, well that's unfortunate :/ anyways thanks i hope something can be done eventually
Tokio - and Rust's async model in general - is pretty freaking cool, but it isn't a perfect fit for everything. After hammering for a few days, I'm pretty confident that it's not working out here:

- There's no way to enforce scoped async tasks without blocking the current thread.[1][2][3] This means that there's no async task equivalent to Rayon/Crossbeam-like scopes, and you're back to arcs and pins and all sorts of fun boilerplate if you'd like to foster parallelism with task::spawn().
- Traits and recursive calls need lots o' boxes, implemented by proc macros at best and by hand at worst.
- Since many FS syscalls block, tokio asyncifies them by wrapping each in a spawn_blocking(), which spawns a dedicated thread. Of course you can wrap chunks of synchronous file I/O in spawn_blocking() if kicking off a separate thread for each File::open() call doesn't sound fun, but that means you can't interact with async code anywhere inside...
- Add in the usual sprinkling of async/await/join/etc. throughout the code base - since anything that awaits needs to be a future itself, async code has a habit of bubbling up the stack.

None of these are dealbreakers by themselves, but they can add up to real overheads. Not just cognitive, but performance too, especially if you've already got a design with concurrent tasks that do a decent job of saturating I/O and farming CPU-heavy work out to a thread pool. < gestures around >

[1]: tokio-rs/tokio#1879
[2]: tokio-rs/tokio#2596
[3]: https://docs.rs/async-scoped/latest/async_scoped/struct.Scope.html#safety-3
This proposal is mostly the result of @Matthias247's work (#1879, #2579, #2576).
I took this work, summarized the API and added a few tweaks as proposals.
Summary
A specific proposal for structured concurrency. The background info and
motivation have already been discussed in #1879, so this will be skipped here.
There are also two PRs with specific proposals: #2579, #2576.
This RFC builds upon this work.
Also related: #2576
The RFC proposes breaking changes for 0.3. "Initial steps" proposes a way to add
the functionality without breaking changes in 0.2.
Requirements
At a high level, structured concurrency requires that:
`Result` and panic) goes unhandled.
Again, see #2579 for more details.
Proposed Changes
- ... `task::scope`.
- `task::signalled().await` waits until the task is signalled, indicating it should gracefully terminate.
- `JoinHandle` forcefully cancels the task on drop.
- `JoinHandle` gains the following methods:
  - `background(self)`: run the task in the background until the scope drops.
  - `try_background(self)`: run the task in the background until the scope drops. If the task completes with `Err`, forcefully cancel the owning scope.
  - `signal(&self)` signals to the task to gracefully terminate.
Terminology
forceful cancellation: The runtime drops the spawned task ASAP without giving it
a chance to complete gracefully. All cleanup is done synchronously in drop
implementations.
graceful cancellation: Signal to a task that it should stop processing and clean
up any resources itself.
Details
Scopes
All spawned tasks must be spawned within a scope. The task is bound to the
scope.
This creates a new scope and executes the provided block within the context of
this scope. All calls to `tokio::spawn` link the task to the current scope. When
the block provided to `scope` completes, any spawned tasks that have not yet
completed are forcefully cancelled. Once all tasks are cancelled, the call to
`task::scope(...).await` completes.

Scopes do not attempt to handle graceful cancellation. Graceful cancellation
is provided by a separate set of utilities described below.
All tasks come with an implicit scope. In other words, spawning a task is
equivalent to:
There could also be a global scope that is used as a catch-all for tasks that
need to run in the background without being tied to a specific scope. This
global scope would be the "runtime global" scope.
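
As a rough illustration of the guarantee a scope provides, the sketch below uses `std::thread::scope` from the standard library rather than the proposed `task::scope`. It is only an analog: thread scopes wait for their children to finish instead of forcefully cancelling them, and the function name `sum_in_scope` is hypothetical. What it does show is the call-site property discussed above: nothing spawned inside the function can outlive the call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: sums each chunk on its own scoped thread.
/// Every thread spawned here is joined before the function returns.
fn sum_in_scope(chunks: &[Vec<u64>]) -> (u64, usize) {
    let finished = AtomicUsize::new(0);

    let total = thread::scope(|s| {
        // Spawn one child per chunk. Children may borrow `chunks` and
        // `finished` because the scope cannot end before they do.
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                let finished = &finished;
                s.spawn(move || {
                    let sum: u64 = chunk.iter().sum();
                    finished.fetch_add(1, Ordering::SeqCst);
                    sum
                })
            })
            .collect();

        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    });

    // The scope has ended, so no child can still be running here.
    (total, finished.load(Ordering::SeqCst))
}

fn main() {
    let chunks: Vec<Vec<u64>> = vec![vec![1, 2, 3], vec![4, 5], vec![6]];
    let (total, finished) = sum_in_scope(&chunks);
    println!("total = {total}, finished children = {finished}");
}
```

The price of this guarantee with threads is that the calling thread blocks until the children finish, which is exactly what an async scope cannot afford to do.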
Error propagation
As @Matthias247 pointed out in #1879, when using `JoinHandle` values, error
propagation is naturally handled using Rust's `?` operator:

In this case, if `t1` completes with `Err`, `t1.await?` will return from the
`async { ... }` block passed to `task::scope`. Once this happens, all
outstanding tasks are forcibly cancelled, resulting in no tasks leaking.
Dropping JoinHandle forcefully cancels the task

Sub task errors must be handled, which means they must not be dropped without
being processed. To do this, the return value of the task must be processed
somehow. This is done by using the `JoinHandle` returned from `tokio::spawn`.
In order to ensure the return value is handled, dropping `JoinHandle` results
in the associated task being forcefully canceled. `JoinHandle` is also
annotated with `#[must_use]`.

However, there are cases in which the caller does not need the return value. For
example, take the `TcpListener` accept loop pattern:

In this case, the caller does not need to track each `JoinHandle`. To handle
these cases, `JoinHandle` gains two new functions (naming TBD):

In the first case, the type signature indicates the task can only fail due to a
panic. In the second, the task may fail due to `Err` being returned by the task.

When a task that is moved to the scope "background" fails due to a panic or
`Err` return, the scope is forcibly canceled, resulting in any outstanding task
associated with the scope being forcibly canceled.

Now, the `TcpListener` accept loop becomes:

Dropping a JoinHandle does not mean the task is terminated
An important detail to note: dropping `JoinHandle` will result in task
cancellation ASAP, but asynchronously. This means that there is no guarantee
that, when `JoinHandle` is dropped, the associated task has terminated. Instead,
the guarantee is that the task will be terminated when the scope that spawned
the task terminates.

For example, the following is not guaranteed to work:

Instead, to make this work, the join handle would need to be awaited:

This makes things a bit tricky when paired with async constructs that leverage
drop for cancellation. For example, this snippet would not work:

Again, this is because the only guarantee is that sub-tasks are terminated when
the containing scope (task that spawned) terminates. This code snippet requires
the ability to guarantee the task is completed mid scope. The only way to do
this is to `.await` on the `JoinHandle`.

Nesting scopes
Usually, structured concurrency describes a "tree" of tasks. Scopes have N
associated tasks. Each one of those tasks may have M sub tasks, etc. When an
error happens at any level, all descendant tasks are canceled and all ancestors
are canceled until the error is handled.

Instead of explicitly linking scopes, Rust's ownership system is used. A
scope is nested by virtue of the return value of `task::scope` (`Scope`) being
held in a task owned by a parent scope. If a `Scope` is dropped it forcefully
cancels all associated sub tasks. This, in turn, results in the sub tasks being
dropped and any `Scope` values held by the sub task to be dropped.

However, `Scope::drop` must be synchronous yet cancelling a task and waiting
for all sub tasks to drop is asynchronous. This is handled as follows:

Every `Scope` has a "wait set" which is the set of tasks that the scope needs to
wait on when it is `.await`ed.

When `Scope` is dropped:

By doing this, `containing_scope.await` does not complete until all descendant
tasks are completely terminated.

If a scope block terminates early due to an unhandled sub task error,
`task::scope(async { ... }).await` completes with an error. In this case, the
task either handles the error or completes the task with an error. In the latter
case, the task's containing scope will receive the error and the process
repeats up the task hierarchy until the error is handled.
spawn_blocking
In order to maintain structured concurrency, `spawn_blocking` would need to
behave the same as `spawn`. The parent spawner would not be able to complete
until all `spawn_blocking` tasks complete. Unfortunately, as it is not possible
to forcibly cancel blocking tasks, this could result in the parent task taking
an indefinite amount of time before completing.

This limitation will have to be documented with `spawn_blocking`. Ideally, we
could come up with mechanisms to limit this impact.
Graceful task cancellation
Graceful task cancellation requires "signalling" a task, then allowing the task
to terminate on its own time. To do this, `JoinHandle` gains a new function,
`signal`. This sends a signal to the task. While this signal could be used for
any purpose, it is implied to indicate graceful cancellation.

From the spawned task, the signal is received using: `task::signalled().await`.

Putting it all together:
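
A minimal thread-based sketch of this signalling flow is shown below. It does not use the proposed `JoinHandle::signal` or `task::signalled()`; instead an `AtomicBool` stands in for the signal, and `worker` and `run_and_signal` are hypothetical names. The shape is the same: the parent requests termination, the task notices on its own time, cleans up, and the parent waits for it to finish.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Hypothetical worker loop: does units of work until it observes the
/// signal, then runs its own cleanup and reports what it did.
fn worker(signalled: &AtomicBool) -> (u64, bool) {
    let mut units = 0u64;
    while !signalled.load(Ordering::Acquire) {
        units += 1; // stand-in for real work
        thread::sleep(Duration::from_millis(1));
    }
    let cleaned_up = true; // stand-in for flushing buffers, closing sockets, ...
    (units, cleaned_up)
}

/// Spawns the worker, lets it run briefly, signals it, then waits for it.
fn run_and_signal() -> (u64, bool) {
    let signalled = AtomicBool::new(false);
    thread::scope(|s| {
        let handle = s.spawn(|| worker(&signalled));
        thread::sleep(Duration::from_millis(20));
        // Analog of signalling the task: request graceful termination...
        signalled.store(true, Ordering::Release);
        // ...then wait for the task to finish on its own time.
        handle.join().expect("worker panicked")
    })
}

fn main() {
    let (units, cleaned_up) = run_and_signal();
    println!("units = {units}, cleaned up = {cleaned_up}");
}
```

Unlike forceful cancellation, the task decides when to stop, so the parent has to wait on the handle to know that cleanup actually ran.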
Initial steps
The majority of this work can be done without any breaking changes. As an
initial step, all behavior in Tokio 0.2 would remain as is. Tasks would not
get an implicit scope. Instead, `task::scope(...)` must always be called.

Secondly, there would be `tokio::spawn_scoped(...)` which would spawn tasks
within the current scope. In 0.3, `spawn_scoped` would become `tokio::spawn`.