[Fix] Fix paddle.floor_divide AssertionError when using CUDA 11.2 #45051
PR types
Bug fixes
PR changes
OPs
Describe
How to reproduce
In an environment with CUDA 11.2, the bug can be reproduced by executing the following code:
The running result contains assertion errors even if b is a non-zero number.

Causes
This bug is caused by the calls to std::trunc(), as the functor gives expected results when these calls are removed.

How I fixed it
My solution is to remove the calls to std::trunc(), since they are NOT necessary here. The reason is two-fold:

1. paddle.floor_divide() supports only int32 and int64 data, and rounding to zero is the defined behavior of integer division in C++.
2. std::trunc() can result in extra type coercions when the input arguments are of integral types, which brings overhead.

It is noteworthy that the floor_divide implementation for integral types in PyTorch also depends on C++ integer division, and no std::trunc() is used.

Also note that the name paddle.floor_divide is not entirely accurate: for divisions involving a negative number and a positive number, the result is rounded to zero instead of an actual floor division. This PR does not change this behavior.