Summary:
related: #2884
I added some SVE implementations of:
- `code_distance`
- `distance_single_code`
- `distance_four_codes`
- `exhaustive_L2sqr_blas_cmax_sve`
- `fvec_inner_products_ny`
- `fvec_madd`
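
For readers unfamiliar with the last two kernels, here is a minimal scalar sketch of what they are commonly understood to compute. It is not the SVE code from this PR. The names `ref_madd` and `ref_inner_products_ny` are hypothetical, and the semantics (`c = a + bf * b`, and one inner product per contiguous database vector) are assumptions based on the usual meaning of these helpers.

```cpp
#include <cstddef>

// Hypothetical scalar reference, not the PR's SVE kernel.
// Assumed semantics: c[i] = a[i] + bf * b[i] for i in [0, n).
inline void ref_madd(std::size_t n, const float* a, float bf,
                     const float* b, float* c) {
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + bf * b[i];
    }
}

// Hypothetical scalar reference, not the PR's SVE kernel.
// Assumed semantics: y holds ny contiguous vectors of dimension d,
// and ip[j] receives the inner product of x with the j-th vector.
inline void ref_inner_products_ny(float* ip, const float* x, const float* y,
                                  std::size_t d, std::size_t ny) {
    for (std::size_t j = 0; j < ny; ++j) {
        float acc = 0.0f;
        for (std::size_t k = 0; k < d; ++k) {
            acc += x[k] * y[j * d + k];
        }
        ip[j] = acc;
    }
}
```

A scalar reference like this can be useful for checking a vectorized kernel against on small inputs, bearing in mind that a vectorized sum may differ in the last bits because it accumulates in a different order.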
## Evaluation result
I evaluated search performance on the SIFT1M dataset on AWS EC2 c7g.large and r8g.large instances.
`main` is the current (2e6551f) implementation.
### c7g.large (Graviton 3)


On Graviton 3, `IndexIVFPQ` improved the most. In the best case (IndexIVFPQ + IndexFlatL2, M: 32), this PR is approx. 2.38-~~2.50~~**2.44**x faster than `main`:
- nprobe: 1, 0.069ms/query → 0.029ms/query
- nprobe: 4, 0.181ms/query → ~~0.074~~**0.075**ms/query
- nprobe: 16, 0.613ms/query → ~~0.245~~**0.251**ms/query
### r8g.large (Graviton 4)


On Graviton 4, the improvement is largest for `IndexIVFPQ` with small `nprobe`. In the best case (IndexIVFPQ + IndexFlatL2, M: 8, nprobe: 1), this PR is approx. 1.33x faster than `main` (0.016ms/query → 0.012ms/query).
Pull Request resolved: #3933
Reviewed By: mengdilin
Differential Revision: D64249808
Pulled By: asadoughi
fbshipit-source-id: 8a625f0ab37732d330192599c851f864350885c4