feat: Add graceful shutdown timer to GRPC frontend #7969
base: main
src/grpc/infer_handler.cc
Outdated
@@ -753,14 +753,20 @@ ModelInferHandler::Process(
     StartNewRequest();
   }

-  if (ExecutePrecondition(state)) {
+  if (accepting_new_conn_ && ExecutePrecondition(state)) {
Do we need a lock for accepting_new_conn_?
Added a lock, conn_mtx_, for both accepting_new_conn_ and cq_shutdown_.
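For readers following the thread, here is a minimal sketch of what guarding both flags with one mutex could look like. Only the member names accepting_new_conn_, cq_shutdown_ and conn_mtx_ come from the comments above; the ShutdownFlags class and its method names are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper, not the PR's code. Only the three member names
// below appear in the review thread.
class ShutdownFlags {
 public:
  // Shutdown path: stop admitting new requests.
  void StopAccepting() {
    std::lock_guard<std::mutex> lk(conn_mtx_);
    accepting_new_conn_ = false;
  }

  // Shutdown path: record that the completion queue has been shut down.
  // Assumption of this sketch: that only happens after new requests are
  // already being rejected.
  void MarkQueueShutdown() {
    std::lock_guard<std::mutex> lk(conn_mtx_);
    assert(!accepting_new_conn_);
    cq_shutdown_ = true;
  }

  // Handler threads read the flags under the same lock.
  bool Accepting() const {
    std::lock_guard<std::mutex> lk(conn_mtx_);
    return accepting_new_conn_;
  }

  bool QueueShutdown() const {
    std::lock_guard<std::mutex> lk(conn_mtx_);
    return cq_shutdown_;
  }

 private:
  mutable std::mutex conn_mtx_;  // guards both flags
  bool accepting_new_conn_ = true;
  bool cq_shutdown_ = false;
};
```

If the flags never need to be read or updated together, a pair of std::atomic<bool> members might be a lighter alternative; a single mutex keeps the two consistent with each other.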
src/grpc/infer_handler.cc
Outdated
@@ -753,14 +753,20 @@ ModelInferHandler::Process(
     StartNewRequest();
   }

-  if (ExecutePrecondition(state)) {
+  if (accepting_new_conn_ && ExecutePrecondition(state)) {
Along with reading requests and returning an error to the client, I would advise that we also prevent calling
service_->RequestModelInfer(
state->context_->ctx_.get(), &state->request_,
state->context_->responder_.get(), cq_, cq_, state);
in L671.
The idea is that we should avoid additional activity on the completion queue once the server shutdown is detected. We should just go through the inflight requests and drain their state objects before exiting the server to prevent any memory leak.
When the “Graceful shutdown function” is called, the server immediately notifies all clients to stop sending new RPCs. Then after the clients have received that notification, the server will stop accepting new RPCs.
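A rough sketch of the suggestion above, under stated assumptions: ArmNewRequest() stands in for the service_->RequestModelInfer(...) call, and the HandlerSketch and State types are hypothetical. It only models the idea that nothing new is registered once shutdown is detected and that the remaining state objects are drained before exit; it does not use the gRPC API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical per-request state object.
struct State {
  int id = 0;
};

// Hypothetical handler model, not the PR's ModelInferHandler.
class HandlerSketch {
 public:
  // Stand-in for service_->RequestModelInfer(...): registers a state object
  // that would receive the next RPC. Refuses once shutdown has begun, so no
  // new activity is added to the completion queue.
  bool ArmNewRequest() {
    if (!accepting_) {
      return false;
    }
    auto state = std::make_unique<State>();
    state->id = next_id_++;
    inflight_.push_back(std::move(state));
    return true;
  }

  void BeginShutdown() { accepting_ = false; }

  // Releases every remaining state object and returns how many were freed.
  // Assumption of this sketch: draining happens only after shutdown began.
  std::size_t Drain() {
    assert(!accepting_);
    const std::size_t drained = inflight_.size();
    inflight_.clear();
    return drained;
  }

  std::size_t InflightCount() const { return inflight_.size(); }

 private:
  bool accepting_ = true;
  int next_id_ = 0;
  std::vector<std::unique_ptr<State>> inflight_;
};
```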
Moved the forceful shutdown until after the core unloads. This prevents responses from returning from the core once the forceful shutdown has been initiated. Completion queues are unloaded only after the forceful shutdown is complete.
… mwittwer/gRPC-shutdown-timer
#ifdef TRITON_ENABLE_GRPC
  if (g_grpc_service) {
    // Forceful shutdown of GRPC service
Question - why would we still need the forceful shutdown of GRPC while we have the graceful shutdown?
The graceful shutdown, GracefulStop(), rejects any new GRPC requests but does not stop ongoing requests. It waits until the timeout completes and then returns, but the GRPC server may still exist. Unloading the Triton server core first prevents writing back to unloaded states in the GRPC server.
The forceful shutdown in Stop() then cancels any remaining requests and stops the GRPC server if it hasn't stopped already.
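To make the difference between the two calls concrete, here is a small standard-library model of the behavior described above. The EndpointSketch class, its counters and the polling loop are assumptions made for illustration; only the names GracefulStop() and Stop() come from the PR, and the real implementation may wait differently (for example, returning early once all requests finish is a choice of this sketch).

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical endpoint model, not the PR's gRPC server class.
class EndpointSketch {
 public:
  // A new request is admitted only while the endpoint is accepting.
  bool Accept() {
    if (!accepting_.load()) {
      return false;
    }
    inflight_.fetch_add(1);
    return true;
  }

  // An in-flight request finished normally.
  void Complete() {
    assert(inflight_.load() > 0);
    inflight_.fetch_sub(1);
  }

  // Rejects new requests, then waits up to `timeout` for in-flight requests
  // to finish. Returns true if nothing is left in flight. Ongoing requests
  // are not cancelled here.
  bool GracefulStop(std::chrono::milliseconds timeout) {
    accepting_.store(false);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (inflight_.load() > 0 &&
           std::chrono::steady_clock::now() < deadline) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return inflight_.load() == 0;
  }

  // Forceful stop: cancels whatever is still in flight and returns how many
  // requests were cancelled.
  int Stop() {
    accepting_.store(false);
    return inflight_.exchange(0);
  }

 private:
  std::atomic<bool> accepting_{true};
  std::atomic<int> inflight_{0};
};
```

In this model a request that outlives the timer survives GracefulStop() and is only cancelled by Stop(), which is why both calls are still needed.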
…rence-server/server into mwittwer/gRPC-shutdown-timer
What does the PR do?
This PR adds a shutdown timer to the gRPC endpoint for both infer and streaming infer requests. In-flight requests are allowed to complete until the timer expires, and new requests made after shutdown has started are rejected.
Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
N/A
Where should the reviewer start?
Start with the new functions in src/grpc/grpc_server.cc and the flow in src/main.cc:
GracefulStop();
unload the models in core
Stop();
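The three-step flow listed above can be sketched as a fixed sequence of hooks. The ShutdownHooks struct and RunShutdown function are hypothetical names used only for this illustration; the actual calls live in src/main.cc and are not reproduced here.

```cpp
#include <cassert>
#include <functional>

// Hypothetical hooks; each one stands in for a step of the flow above.
struct ShutdownHooks {
  std::function<void()> graceful_stop;  // GracefulStop();
  std::function<void()> unload_models;  // unload the models in core
  std::function<void()> stop;           // Stop();
};

// Runs the steps in the order given in the PR description: reject new
// requests and wait for the timer, unload the core, then force-stop the
// endpoint.
inline void RunShutdown(const ShutdownHooks& hooks) {
  assert(hooks.graceful_stop && hooks.unload_models && hooks.stop);
  hooks.graceful_stop();
  hooks.unload_models();
  hooks.stop();
}
```

Keeping the order in a single function makes it harder to run the forceful stop before the core has unloaded, which is the ordering issue raised in the review.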
Test plan:
Review L0_lifecycle and the gRPC-related tests.
The updated shutdown behavior changes the expected responses for some lifecycle tests.
Caveats:
The updated shutdown process includes three responses. The CANCELLED state cannot be removed as it is part of the GRPC endpoint shutdown behavior:
Background
Shutdown behavior for the gRPC endpoint could lead to unexpected errors when unloading the models from the core.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
N/A