Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use math builtins instead of <math.h> if available #8467

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vittorioromeo
Copy link

Approaching compile-time micro-optimization territory here, but this reliably shaves off 40-50ms of build time in any TU that includes imgui_internal.h on my system. As an added benefit, we also avoid exposing math.h to those TUs, any potentially improve debug codegen a little bit.

Both GCC and Clang on my system have all the required builtins. I don't use MSVC.

If the PR is deemed worthwhile, I will clean-up the PR to avoid the code repetition and improve formatting. Otherwise, feel free to close -- no hard feelings!

@ocornut
Copy link
Owner

ocornut commented Mar 7, 2025

Thanks for your PR!

It's interesting to know that this is possible. I'm slightly interested in potentially it affecting Debug Codegen, more than compile time since imgui_internal.h is not particularly often included in user's codebases.

My gut intuition is that the slight visible cruft/bloat and potential maintenance may rub me off, but I'll look at this more closely at some point. (Don't expect it to be soon as I'm in a permanent state of death by thousands cuts with issues/pr, if I treat all the very peripheral stuff like that I'm never moving forward with bigger stuff).

@vittorioromeo
Copy link
Author

No worries, let me know if you want me to take a stab at cleaning it up to minimize the cruft/bloat! :)

@kleisauke
Copy link

Wouldn't -fno-math-errno achieve the same effect? That is, speed things up by allowing the compiler to optimize math library functions to inline code or intrinsics - see for example:
https://llvm.org/docs/Vectorizers.html#vectorization-of-function-calls
https://ryanburn.com/2020/12/26/why-c-standard-math-functions-are-slow/

@vittorioromeo
Copy link
Author

Wouldn't -fno-math-errno achieve the same effect? That is, speed things up by allowing the compiler to optimize math library functions to inline code or intrinsics - see for example: https://llvm.org/docs/Vectorizers.html#vectorization-of-function-calls https://ryanburn.com/2020/12/26/why-c-standard-math-functions-are-slow/

Nope, even with -fno-math-errno, a call instruction is generated without optimizations enabled. See https://gcc.godbolt.org/z/anbzjvzc9

@kleisauke
Copy link

Nope, even with -fno-math-errno, a call instruction is generated without optimizations enabled. See https://gcc.godbolt.org/z/anbzjvzc9

Hmm, that's with #include <cmath> and std::fabs(). However, with #include <math.h>, fabsf() is identical to __builtin_fabsf() when compiling without optimizations on both Clang and GCC - see for example:
https://gcc.godbolt.org/z/Khe7z5Kxs

(the only difference on GCC is between ceilf and __builtin_ceilf)

@vittorioromeo
Copy link
Author

Nope, even with -fno-math-errno, a call instruction is generated without optimizations enabled. See https://gcc.godbolt.org/z/anbzjvzc9

Hmm, that's with #include <cmath> and std::fabs(). However, with #include <math.h>, fabsf() is identical to __builtin_fabsf() when compiling without optimizations on both Clang and GCC - see for example: https://gcc.godbolt.org/z/Khe7z5Kxs

(the only difference on GCC is between ceilf and __builtin_ceilf)

My bad, sorry! I didn't expect that choosing between std::fabsf and std::fabs would have made a difference in codegen, but I should have checked. Note that <cmath> is also fine, the call seems to be generated due to the overload API being used.

I wonder if this is an optimization applied in the compiler frontend as a special case, or if something else is going on...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants