zfex (also on PyPI) is a performance-focused fork of zfec that exploits SIMD on both Intel and ARM, with impressive results:
Legacy zfec scored slightly above 50 MB/sec in both results. zfex ran faster in all cases, achieving its best performance with -DZFEX_UNROLL_ADDMUL_SIMD=4 unrolling, which gave an almost 6-fold speed-up.
Having an automatic and seamless way to pick faster versions of the algorithm if the hardware capabilities are available would be really neat.
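To make the idea concrete, here is a rough sketch of what runtime selection could look like on x86 with GCC or Clang. It is not taken from zfec or zfex: the names `gf_mul`, `addmul_scalar`, `addmul_ssse3`, and `pick_addmul` are hypothetical, and the field polynomial 0x11d is an assumption made for illustration. The SSSE3 kernel uses the common split-nibble table lookup, and any other platform simply falls back to the portable loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

/* Multiply in GF(2^8), reducing by 0x11d (assumed polynomial). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return r;
}

/* Portable reference kernel: dst[i] ^= c * src[i]. */
static void addmul_scalar(uint8_t *dst, const uint8_t *src, uint8_t c, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] ^= gf_mul(c, src[i]);
}

#ifdef HAVE_X86_DISPATCH
/* SSSE3 kernel: c*x = c*(low nibble) ^ c*(high nibble << 4), looked up via pshufb. */
__attribute__((target("ssse3")))
static void addmul_ssse3(uint8_t *dst, const uint8_t *src, uint8_t c, size_t n)
{
    uint8_t lo[16], hi[16];
    for (int i = 0; i < 16; i++) {
        lo[i] = gf_mul(c, (uint8_t)i);
        hi[i] = gf_mul(c, (uint8_t)(i << 4));
    }
    const __m128i tlo = _mm_loadu_si128((const __m128i *)lo);
    const __m128i thi = _mm_loadu_si128((const __m128i *)hi);
    const __m128i mask = _mm_set1_epi8(0x0f);
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i x = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i d = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i l = _mm_shuffle_epi8(tlo, _mm_and_si128(x, mask));
        __m128i h = _mm_shuffle_epi8(thi, _mm_and_si128(_mm_srli_epi64(x, 4), mask));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(d, _mm_xor_si128(l, h)));
    }
    addmul_scalar(dst + i, src + i, c, n - i); /* tail shorter than 16 bytes */
}
#endif

typedef void (*addmul_fn)(uint8_t *, const uint8_t *, uint8_t, size_t);

/* Choose a kernel once, based on what the running CPU reports. */
static addmul_fn pick_addmul(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("ssse3")) return addmul_ssse3;
#endif
    return addmul_scalar;
}
```

A caller would do `addmul_fn addmul = pick_addmul();` once at start-up and then call `addmul(dst, src, c, n)` in the hot loop, so the same binary keeps working on CPUs without SSSE3. An ARM build would need its own capability check and NEON kernel, which this sketch leaves out.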
It is fantastic that zfex exists and that its contributors have been able to achieve all those performance gains. However, this comes at the cost of some additional complexity and a bigger maintenance overhead. Take a look at https://github.com/WojciechMigda/zfex/blob/main/zfex/zfex.c, for example, and decide for yourself whether you really want to add those changes to zfec.
The zfex fork is a result of our refusal to merge hand-rolled assembly to zfec: see #71.
I think there is room for both zfec and zfex in the world: zfex can courageously make all the performance improvements that they can, and zfec can remain simpler and more conservative. There is value in both approaches.
> The zfex fork is a result of our refusal to merge hand-rolled assembly to zfec: see #71.
Well, to be really accurate, it was our inability (mine really, and other folks' unavailability) to review the code, not outright refusal, that caused the fork. :-)