Improve the design of StreamSafeCUDAAllocator #38195

From00 · 2021-12-16T06:26:29Z

PR types

New features

PR changes

Others

Describe

This PR makes the following changes to improve the StreamSafeCUDAAllocator implemented in #37290:

Rewrites StreamSafeCUDAAllocator to create and record an event directly when RecordStream is called, rather than when the allocation is freed
Changes the default stream of StreamSafeCUDAAllocator from the NULL stream to the main stream in CUDADeviceContext
Adds a GetStream interface for getting the owning stream of a StreamSafeCUDAAllocation
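
To make the first and third items concrete, here is a minimal sketch of an allocation that records its event eagerly and exposes its owning stream. It is a toy model, not the code in this PR: `FakeEvent`, the integer stream ids, the `Allocation` class, and the choice to skip the owning stream are all assumptions made so the example runs without a GPU. Only the names `RecordStream` and `GetStream` come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for cudaEvent_t. `completed` plays the role of a
// successful event query; a test flips it by hand.
struct FakeEvent {
  bool completed = false;
};

// Hypothetical allocation type; streams are modelled as plain int ids.
class Allocation {
 public:
  Allocation(std::size_t size, int owning_stream)
      : size_(size), owning_stream_(owning_stream) {}

  // Mirrors the GetStream interface: returns the stream that owns this block.
  int GetStream() const { return owning_stream_; }

  std::size_t size() const { return size_; }

  // Eager design: the event is created and "recorded" right here, while the
  // caller's stream is known to be alive. Returns nullptr for the owning
  // stream, which is assumed (in this sketch) to need no extra event.
  FakeEvent* RecordStream(int stream) {
    if (stream == owning_stream_) return nullptr;
    FakeEvent& event = events_[stream];
    event = FakeEvent{};  // recording again replaces the older event
    return &event;
  }

  // At free time only existing events are inspected; no stream is touched.
  bool CanBeFreed() const {
    for (const auto& entry : events_) {
      if (!entry.second.completed) return false;
    }
    return true;
  }

 private:
  std::size_t size_;
  int owning_stream_;
  std::map<int, FakeEvent> events_;  // one outstanding event per stream
};
```

In this model a caller would record a stream, let the work finish (here: set `completed`), and only then see `CanBeFreed()` return true.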

Why do we no longer create and record events in a deferred manner?
When the main process finishes, Paddle may clear some remaining tensors in shared memory (see #22541), and thus free allocations that were recorded on some CUDA streams. However, the streams created in the executor have already been destroyed by that time, so we should not try to record events on them. Similar problems may occur in any scenario where StreamSafeCUDAAllocator is used, so we cannot assume that the recorded streams are still valid when freeing CUDA allocations.
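
The lifetime problem described above can be illustrated with a small simulation. Everything here is hypothetical (`StreamRegistry`, `RecordEvent`, `DeferredAllocation`, `EagerAllocation`, integer stream ids); it uses no CUDA calls and only models the ordering of "record" versus "stream destroyed", assuming that an event recorded earlier remains usable after its stream is gone.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical registry of live streams, identified by int ids.
struct StreamRegistry {
  std::set<int> live;
  void Create(int id) { live.insert(id); }
  void Destroy(int id) { live.erase(id); }
  bool IsLive(int id) const { return live.count(id) != 0; }
};

// Models recording an event on a stream: only valid while the stream is live.
bool RecordEvent(const StreamRegistry& registry, int stream) {
  return registry.IsLive(stream);
}

// Deferred design: RecordStream only remembers the stream; the event is
// recorded later, when the allocation is freed.
struct DeferredAllocation {
  std::vector<int> recorded_streams;
  void RecordStream(int stream) { recorded_streams.push_back(stream); }
  // Returns false if any remembered stream is gone by the time of the free.
  bool Free(const StreamRegistry& registry) const {
    for (int stream : recorded_streams) {
      if (!RecordEvent(registry, stream)) return false;
    }
    return true;
  }
};

// Eager design: the event is recorded inside RecordStream, so freeing never
// needs the stream again.
struct EagerAllocation {
  int recorded_events = 0;
  bool RecordStream(const StreamRegistry& registry, int stream) {
    if (!RecordEvent(registry, stream)) return false;
    ++recorded_events;
    return true;
  }
  // Only already-recorded events would be checked here; no stream access.
  bool Free(const StreamRegistry&) const { return true; }
};
```

If a stream is recorded on both allocations and then destroyed before the free, the deferred variant fails at `Free` while the eager variant does not, which is the scenario this PR is meant to avoid.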

paddle-bot-old · 2021-12-16T06:26:59Z

✅ This PR's description meets the template requirements!
Please wait for other CI results.

paddle-bot-old · 2021-12-16T06:27:01Z

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

zhiqiu

LGTM

Add GetStream Interface for StreamSafeCUDAAllocator

23e61d6

From00 changed the title ~~Add GetStream Interface for StreamSafeCUDAAllocator~~ Add GetStream Interface for StreamSafeCUDAAllocation Dec 16, 2021

From00 changed the title ~~Add GetStream Interface for StreamSafeCUDAAllocation~~ Improve the design of StreamSafeCUDAAllocator Dec 16, 2021

zhiqiu approved these changes Dec 17, 2021

zhiqiu merged commit b0d12d9 into PaddlePaddle:develop Dec 17, 2021

From00 deleted the add-GetStream-interface-for-StreamSafeCudaAllocator branch December 17, 2021 03:49
