Optimize gemv_n_sve kernel #5157
base: develop
Can we also update the plot s.t:
Yes, an explanation of the benchmark graph would be nice. I'm not sure I understand why your PR would result in two very narrow regions where there is more than a tenfold speedup?
Please refer to the updated graph. The spikes appear at problem sizes where the baseline performance is poor, which is why the speedup is so large there.
Please see the updated graph.
@martin-frbg could you please re-review?
The sgemv_n kernel is loop-unrolled using svmla_lane, with a main loop for the unrolled iterations and a tail loop for the remainder.
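For readers unfamiliar with the main-loop/tail-loop pattern, here is a minimal scalar sketch of a non-transposed GEMV (y += alpha * A * x, column-major A). It is not the code from this PR: the function name `sgemv_n_sketch` and the unroll factor of 4 columns are assumptions chosen for illustration, and plain C is used so it runs on any machine. In an SVE kernel, the per-column multiply-adds in the inner loop would be the natural place for vector multiply-accumulate intrinsics such as svmla_lane.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: y += alpha * A * x, where A is m x n, column-major,
 * with leading dimension lda. The unroll factor of 4 is an assumption. */
void sgemv_n_sketch(size_t m, size_t n, float alpha,
                    const float *a, size_t lda,
                    const float *x, float *y)
{
    assert(lda >= m);
    size_t j = 0;

    /* Main loop: process four columns per pass, so each y[i] is loaded
     * and stored once for every four multiply-adds. */
    for (; j + 4 <= n; j += 4) {
        const float x0 = alpha * x[j + 0];
        const float x1 = alpha * x[j + 1];
        const float x2 = alpha * x[j + 2];
        const float x3 = alpha * x[j + 3];
        const float *a0 = a + (j + 0) * lda;
        const float *a1 = a + (j + 1) * lda;
        const float *a2 = a + (j + 2) * lda;
        const float *a3 = a + (j + 3) * lda;
        for (size_t i = 0; i < m; i++) {
            y[i] += a0[i] * x0 + a1[i] * x1 + a2[i] * x2 + a3[i] * x3;
        }
    }

    /* Tail loop: handle the remaining n % 4 columns one at a time. */
    for (; j < n; j++) {
        const float xj = alpha * x[j];
        const float *aj = a + j * lda;
        for (size_t i = 0; i < m; i++) {
            y[i] += aj[i] * xj;
        }
    }
}
```

With n = 5, for example, the main loop handles columns 0 to 3 in one pass and the tail loop handles column 4.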
The graph below shows performance improvement for OMP_NUM_THREADS=1 on Graviton-3: