-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IsMean template parameter for compile #57558
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
80efc23
to
6bc8f48
Compare
841e553
to
65b5dcd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR描述加一下,本PR主要有什么修改,为什么能减少包体积。
{0, 1}, | ||
stream); | ||
phi::SumKernel<T>( | ||
dev_ctx_, *d_output, {0, 1}, d_output->dtype(), false, d_bias); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
第4个参数应该传输出的dtype
,虽然大部分情况下输入、输出的dtype
一样,但还是按照参数的语义来传比较好
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -14,14 +14,14 @@ limitations under the License. */ | |||
|
|||
#pragma once | |||
|
|||
#include "paddle/fluid/operators/reduce_ops/reduce_op.cu.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fluid
实际上只有这个文件用到了reduce_op.cu.h
,后面可以提个PR移除reduce_op.cu.h
文件。- 已经存在
paddle/phi/kernels/fusion/gpu/attn_gemm.h
,fluid下面的这个头文件是不是可以删除了?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的,这部分会在下个PR 统一修改,避免功能修改混乱
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
{0, 1, 2}, | ||
stream); | ||
phi::SumKernel<T>( | ||
dev_ctx_, *d_output, {0, 2}, d_output->dtype(), false, d_bias); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
左边删除代码中的reduce_dim
是{0, 1, 2}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
class ReduceOp, | ||
typename TransformOp, | ||
bool IsMean = false> | ||
struct CubTensorReduce; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false
的实现(L1012 - L1027)可以合并到这里,不用单独特化,只需特化true
的实现。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
73dff53
to
dd49e9a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 我之前有一点不确定的是,这减少5M里面,有多少是修改IsMean
模板带来的,有多少是attn_gemm.h
里面将ReduceKernel
调用改成phi::ReduceSum
带来的。不过引用了attn_gemm.h
的算子似乎都不在phi目录下,所以应该对libphi.a没有贡献。
{0, 1}, | ||
stream); | ||
phi::SumKernel<T, phi::GPUContext>( | ||
dev_ctx_, *d_output, {0, 1}, d_output->dtype(), false, d_bias); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里第4个参数out_dtype
,应该是d_bias
的dtype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
之前体积的统计是在没修改sumKernel的情况下统计的
PR types
Others
PR changes
Others
Description
Pcard-70459
本PR主要修改是将reduce_function.h中关于cubReduce的调用从is_mean普通参数判断调整为模板参数判断。
只有在ReduceMean OP 中才能将模板参数中的IsMean设置为True, 其他情况默认是False
本PR的修改原理,避免非ReduceMeanOP对于DivFunctor的编译例化,导致CubReduce的相关编译体积过大。
libphi.a的体积减少为: 5M : 872->867