Simd improvement #1278

Merged
merged 1 commit into from
Mar 19, 2024
Conversation

laurentcau

  • Add SIMD aligned_vec3 (and SSE aligned_dvec3, stored in 2 x xmm registers)
  • Fast packed_vec3 <=> aligned_vec3 and packed_vec4 <=> aligned_vec4 conversion
  • Fast aligned_vec3 <=> aligned_vec4 conversion
  • Optimized aligned_mat x aligned_mat and aligned_mat x aligned_vec
  • SIMD version of inverse for aligned_mat3 (actually slower than the SISD version on my computer, even though it has 30% fewer instructions?)
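
To illustrate what the packed <=> aligned conversions in the list above involve, here is a minimal sketch. It does not use GLM: `PackedVec3`, `AlignedVec3`, `AlignedVec4` and the four helper functions are hypothetical stand-ins, assumed only for illustration, and the real implementation in this PR uses SIMD intrinsics rather than scalar copies.

```cpp
#include <cstring>

// Hypothetical stand-ins for illustration only; these are NOT GLM's types.
struct PackedVec3 { float x, y, z; };                    // tightly packed: 12 bytes
struct alignas(16) AlignedVec3 { float x, y, z, pad; };  // padded to one 128-bit lane
struct alignas(16) AlignedVec4 { float x, y, z, w; };

static_assert(sizeof(PackedVec3) == 12, "packed vec3 has no padding");
static_assert(sizeof(AlignedVec3) == 16 && alignof(AlignedVec3) == 16,
              "aligned vec3 occupies a full 16-byte slot");

// packed -> aligned: copy three floats and zero the padding lane.
inline AlignedVec3 to_aligned(const PackedVec3& p) {
    return AlignedVec3{p.x, p.y, p.z, 0.0f};
}

// aligned -> packed: drop the padding lane.
inline PackedVec3 to_packed(const AlignedVec3& a) {
    return PackedVec3{a.x, a.y, a.z};
}

// aligned vec3 -> aligned vec4: both are 16 bytes, so the storage can be
// copied as a whole and only the fourth lane needs to be set.
inline AlignedVec4 to_vec4(const AlignedVec3& a, float w) {
    AlignedVec4 r;
    std::memcpy(&r, &a, sizeof r);
    r.w = w;
    return r;
}

// aligned vec4 -> aligned vec3: same storage, fourth lane reset to zero.
inline AlignedVec3 to_vec3(const AlignedVec4& v) {
    AlignedVec3 r;
    std::memcpy(&r, &v, sizeof r);
    r.pad = 0.0f;
    return r;
}
```

Because the aligned vec3 and vec4 share the same 16-byte footprint, converting between them is essentially a register copy, which is why that path can be made fast.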

@laurentcau
Author

Note: I fixed the integer division issue reported in #1255.
I also added a test to guard against future regressions.
As for the template issues with GLM_FORCE_NEON, I can't fix them since no details were given.
All tests compile fine on my side with clang + NEON, so it must be something not covered by the tests, but what? @dimitre

@laurentcau force-pushed the b7 branch 3 times, most recently from 5a553eb to 64038f9 (March 18, 2024 10:11)
@christophe-lunarg

Considering you didn't receive any answer on the ARM NEON issue after multiple queries, I think it's fair to break it and have whoever it matters to submit a PR.

@christophe-lunarg

Thanks for contributing!
