Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Float literals - fix main primitives to use .f #57703

Merged
merged 1 commit into from
Feb 12, 2022

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented Feb 6, 2022

Converts float literals from double format (e.g. 0.0) to float format (e.g. 0.0f) where appropriate for 32 bit calculations.

As discussed in the PR meeting, there was a desire for 2 PRs, this one, including changes to use 0.0f float format literals, and another one for casts to real_t etc.

As discussed, this covers the main classes in core/math which are most likely to be bottlenecks, rather than the whole codebase. The exception is BVH which makes more sense to do in the next sync with 3.x.

See #56396 for details.

Introduction

After some investigation using godbolt using clang, gcc, msvc, ARM and all major compilers, it turns out that our technique of using double literals in float calculations 0.0 is problematic because it causes promotion of the calculation to occur in 64 bit.

This involves:

  • Converting arguments from 32 to 64 bit
  • Calculation in 64 bit (potentially slower because the SIMD instruction is wider, less can be done at once)
  • Converting result back to 32 bit

This is considerably less efficient than simply doing the calculation in 32 bit in a single instruction, which is normally the intention. The difference this makes in terms of speed / power use etc will depend on the platform - some mobiles may be particularly sensitive. The binary may also be slightly smaller too, but probably not by a huge margin.

There appear to be 3 obvious ways of resolving this problem:

  • Specify the literal as a float e.g. 0.0f
  • Cast the value to float (or real_t depending on usage)
  • Use a macro to cast to e.g. real_t

Caveats

Although these calculations will be be faster, the result may in some cases be very slight smidgen different. Although the arguments and results are to all extents and purposes 32 bit in the before and after case, the calculation was done at a higher precision when 64 bit was (erroneously) used.

This is equivalent to the difference between the result of 80 bit floating point calculations converted to 64 bit on the old x86 instructions. Something to be aware of when attempting e.g. deterministic repeatability, but hopefully not hugely important in terms of result accuracy (although we might watch for any possible difference in e.g. physics). This would have been nice to have avoided by doing all the calculations at 32 bit from the get go, but it seems worth investigating this change.

@lawnjelly lawnjelly requested a review from a team as a code owner February 6, 2022 11:52
@Calinou
Copy link
Member

Calinou commented Feb 6, 2022

On a similar note, should using is_equal_approx() with a double and float (or a float and double) be allowed? Right now, you need to cast one of the values with real_t() to ensure it'll work consistently regardless of build flags.

@Calinou Calinou added this to the 4.0 milestone Feb 6, 2022
@Calinou Calinou added enhancement and removed bug labels Feb 6, 2022
@lawnjelly
Copy link
Member Author

On a similar note, should using is_equal_approx() with a double and float (or a float and double) be allowed? Right now, you need to cast one of the values with real_t() to ensure it'll work consistently regardless of build flags.

I'm personally ok with casting in this situation, as it makes it explicit. It did fox me the first time though. Don't know what others think about solutions for this.

@lawnjelly lawnjelly force-pushed the float_literals_math_funcs branch from 2f80f25 to 6c6dcba Compare February 6, 2022 15:39
@lawnjelly
Copy link
Member Author

lawnjelly commented Feb 6, 2022

Worth mentioning as well, an alternative approach to handling constants, rather than casting where used, it to define them as e.g.:

#define CMP_EPSILON ((real_t) 0.00001)
// instead of 
#define CMP_EPSILON 0.00001

However this won't work for some of the constants that have more significant figures e.g.:

#define Math_SQRT12 ((real_t) 0.7071067811865475244008443621048490)

because in some circumstances we may WANT to use them as doubles even when compiling with 32bit real_t (perhaps in calculations where we need more accuracy for instance), and casting them to float would lose the extra detail (even if we then cast this back to double) - I have checked this in compiler.

We could alternatively have defines for real_t and double, something like this:

#define Math_SQRT12 0.7071067811865475244008443621048490
#define Math_SQRT12_R ((real_t) Math_SQRT12)

or perhaps the opposite might be more practical, i.e. have the (real_t) version as the default, then have the double version available for special cases:

#define Math_SQRT12_D 0.7071067811865475244008443621048490
#define Math_SQRT12 ((real_t) Math_SQRT12_D)

I can change the PR to use this type of approach if it is preferred, there are advantages and disadvantages to each. Having (real_t) in the actual code is more verbose, but on the other hand it is more obvious what is going on to a reader, and might be more newbie friendly than being "clever" with the defines.

@lawnjelly
Copy link
Member Author

lawnjelly commented Feb 8, 2022

PR Meeting: Split into 2 PRs, one for .f and one for (real_t) casts, and do all the key math stuff in these PRs (vectors, math etc).

We won't bother with non-bottleneck stuff, and keep the standard approach as using double literal, and just fix up in future where necessary.

@lawnjelly lawnjelly force-pushed the float_literals_math_funcs branch from 6c6dcba to b39be40 Compare February 10, 2022 18:39
Converts float literals from double format (e.g. 0.0) to float format (e.g. 0.0f) where appropriate for 32 bit calculations.
@lawnjelly lawnjelly force-pushed the float_literals_math_funcs branch from b39be40 to 5298e16 Compare February 10, 2022 18:43
@lawnjelly lawnjelly changed the title Float literals - fix math_funcs Float literals - fix main primitives to use .f Feb 10, 2022
b < 0.04045 ? b * (1.0 / 12.92) : Math::pow((b + 0.055) * (1.0 / (1 + 0.055)), 2.4),
r < 0.04045f ? r * (1.0 / 12.92) : Math::pow((r + 0.055f) * (float)(1.0 / (1 + 0.055)), 2.4f),
g < 0.04045f ? g * (1.0 / 12.92) : Math::pow((g + 0.055f) * (float)(1.0 / (1 + 0.055)), 2.4f),
b < 0.04045f ? b * (1.0 / 12.92) : Math::pow((b + 0.055f) * (float)(1.0 / (1 + 0.055)), 2.4f),
a);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the (float) casting is on purpose here rather than in a later PR - it needs to be done to prevent a warning using pow, otherwise the version of pow to call (float or double) is unclearly defined.

@lawnjelly
Copy link
Member Author

Just to note, this should be good to go now. 👍

Could do with someone else glancing through. It's pretty much a case of replacing with .f, except for special cases where double precision might be needed. I might have missed the odd conversion, but it's not crucial, we are just aiming to get most of those using 32 bit math when compiled real_t as float.

Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me (very cursory review).

@akien-mga akien-mga merged commit daf9729 into godotengine:master Feb 12, 2022
@akien-mga
Copy link
Member

Thanks!

@lawnjelly lawnjelly deleted the float_literals_math_funcs branch February 12, 2022 09:09
Copy link
Member

@aaronfranke aaronfranke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is already merged but I decided to look over it anyway and I give it a ✅

Everything in this PR falls into one of these 3 categories:

  • The float literal has an identical value to the double literal (such as sums of powers of two 0.5f or integers 0.0f, 1.0f, etc)
  • The value is something we don't need the exact value of (such as checking if a value is > 0.9999f)
  • The value is in Color, which should be always 32-bit anyway (since more precision is useless for Color)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants