runtime: Notify another worker if unparked from driver #6245
No apparent performance regression seen running benches/rt_multi_threaded:
Platform:
    if self.did_acquire_driver_lock {
        worker.handle.notify_parked_local();
        self.did_acquire_driver_lock = false;
    }
did_acquire_driver_lock is checked every time in run_task() after transition_from_searching(), so that notify_parked_local() does not consider the current core as searching.
Hi, I see that you are trying to improve the Tokio runtime blocking problem, but it seems that we can achieve this in a simpler way. Please see #6251. If you have additional considerations, please let me know!
@wathenjiang That's a rather neat solution. I'm not sure my understanding is correct: unparking with tasks means that the worker was unparked from the driver, since a wake by another worker would unpark it with an empty local task queue and an empty fifo slot. Though I have a question regarding the case where
Motivation
This PR is related to the issue "One bad task can halt all executor progress forever".
If the busy workers are blocked running tasks, and all other workers are parked on condition variables, then no worker is blocking on the I/O driver. New I/O events go unprocessed, so the system stalls unnecessarily even though idle workers are available.
Solution
This problem might be mitigated by attempting to wake some idle worker (if any) after unparking from the I/O driver, right before running a task.
Previously, if the current worker is the final searching worker, transition_from_searching() wakes another idle worker to steal tasks. When a worker wakes from the I/O driver with new events, it is not in a searching state, so notify_parked_local() is not called in run_task(). The PR adds a call to notify_parked_local() in run_task() if the worker wakes from the I/O driver.
Add a flag did_acquire_driver_lock to the Core data to record whether the worker locked the driver during the previous tick. In run_task(), before actually running the task, check did_acquire_driver_lock after transition_from_searching(). If the flag is true, call notify_parked_local() to notify an idle worker, if any (it won't notify if some worker is currently searching).
The PR tries to mitigate the issue by ensuring that some worker other than the current one is running (searching), since the current worker might unfortunately get blocked.
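The flag handling described above can be sketched with a small standalone model. This is not Tokio's actual code: the `Idle` and `Core` types, the `park_on_driver()` helper, and the counters are hypothetical stand-ins invented for illustration, and only the names `did_acquire_driver_lock`, `transition_from_searching`, `notify_parked_local`, and `run_task` are taken from the PR. The model assumes a woken worker immediately enters the searching state.

```rust
/// Toy model of the shared idle-worker bookkeeping.
/// Hypothetical stand-in, not Tokio's real scheduler state.
#[derive(Debug, Default)]
struct Idle {
    num_searching: usize,
    num_parked: usize,
    wakeups: usize,
}

impl Idle {
    /// Wake one parked worker unless some worker is already searching.
    /// Returns true if a worker was woken.
    fn notify_parked_local(&mut self) -> bool {
        if self.num_searching > 0 || self.num_parked == 0 {
            return false;
        }
        self.num_parked -= 1;
        // Assumption: the woken worker starts out in the searching state.
        self.num_searching += 1;
        self.wakeups += 1;
        true
    }
}

/// Toy per-worker state, standing in for the PR's `Core`.
#[derive(Debug, Default)]
struct Core {
    is_searching: bool,
    did_acquire_driver_lock: bool,
}

impl Core {
    /// Hypothetical helper: the worker parked on the I/O driver this tick.
    fn park_on_driver(&mut self) {
        self.did_acquire_driver_lock = true;
    }

    /// Leave the searching state; the last searcher wakes a replacement.
    fn transition_from_searching(&mut self, idle: &mut Idle) {
        if !self.is_searching {
            return;
        }
        self.is_searching = false;
        idle.num_searching -= 1;
        if idle.num_searching == 0 {
            idle.notify_parked_local();
        }
    }

    /// Run a task. The flag is checked after `transition_from_searching`,
    /// so this worker is no longer counted as a searcher when notifying.
    fn run_task(&mut self, idle: &mut Idle, task: impl FnOnce()) {
        self.transition_from_searching(idle);
        if self.did_acquire_driver_lock {
            idle.notify_parked_local();
            self.did_acquire_driver_lock = false;
        }
        task();
    }
}
```

In this model, a worker that returns from `park_on_driver()` and then calls `run_task()` wakes exactly one parked worker, and later tasks do not trigger further wakeups because the flag has been cleared and a searcher already exists. A worker that never touched the driver wakes nobody, which matches the behaviour before the change.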
Test Cases
1
Output:
2
Output: