Add optional AVX512-FP16 arithmetic for the scalar quantizer. #4225
Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
@mulugetam Do I understand correctly that this PR introduces an optional tradeoff between accuracy and speed?
Yes.
Where does the loss in accuracy come from? Is it because the computation is performed as fp16 * fp16 instead of converting fp16 -> fp32 and then computing fp32 * fp32?
Yes, pure fp16 FMAD operations.
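To make the tradeoff discussed above concrete, here is a minimal sketch in Python that emulates the two strategies in software. It is not the PR's kernel: the function names `dot_widened` and `dot_pure_fp16` are hypothetical, rounding is emulated with the standard library's `struct` half-precision format, and the input is deliberately adversarial so that the effect is easy to see. Real AVX512-FP16 code accumulates in many vector lanes and would typically lose far less than this.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (ties to even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_widened(a, b):
    """fp16 inputs, widened so that the multiply and accumulate happen in fp32."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp32(to_fp16(x) * to_fp16(y))
        acc = to_fp32(acc + prod)
    return acc


def dot_pure_fp16(a, b):
    """fp16 inputs with an emulated fp16 fused multiply-add.

    The product and sum are formed in double precision and rounded once
    to fp16, which approximates a single-rounding FMA.
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(to_fp16(x) * to_fp16(y) + acc)
    return acc


if __name__ == "__main__":
    n = 4096
    a = [1 / 32] * n  # 1/32 is exactly representable in fp16
    b = [1 / 32] * n  # each product is 2**-10
    print(dot_widened(a, b))    # 4.0
    print(dot_pure_fp16(a, b))  # 2.0
```

In this contrived case the fp16 accumulator stalls at 2.0: above that value the spacing between adjacent fp16 numbers is 2^-9, so adding 2^-10 lands exactly halfway and rounds back to 2.0 every time. The fp32 accumulator has enough precision to represent every partial sum exactly.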
@alexanderguzhva I'm not sure why this PR fails some of the
@mulugetam Well, that's the saddest part of committing PRs, in my experience. Can you reproduce the problem if you rebase your PR on top of the head? Otherwise, I have no explanation: it could be different hardware on the CI machine, the compiler, or maybe your code.
@alexanderguzhva Yes, it's reproducible. If I
PR #4025 introduced a new architecture mode, `avx512_spr`, which enables the use of features available since Intel® Sapphire Rapids. The Hamming Distance Optimization (PR #4020), based on this mode, is now used by OpenSearch to speed up the indexing and searching of binary vectors.

This PR adds support for `AVX512-FP16` arithmetic for the Scalar Quantizer. It introduces a new Boolean flag, `ENABLE_AVX512_FP16`, which, when used together with the `avx512_spr` mode, explicitly enables `avx512fp16` arithmetic.

Tests on an AWS r7i instance demonstrate up to a 1.6x speedup in execution time when using `AVX512-FP16` compared to `AVX512`. The improvement comes from a reduction in path length.

`-DFAISS_OPT_LEVEL=avx512`:

`-DFAISS_ENABLE_AVX512_FP16=ON -DFAISS_OPT_LEVEL=avx512_spr`