Fix several ubsan reported misaligned accesses #100325

hpvb · 2024-12-12T16:23:57Z

These misaligned accesses are shown in all of our CI hooks. It turned out to not be difficult to fix.

It is likely that this will improve performance for aarch64.

akien-mga · 2024-12-12T16:31:13Z

drivers/vulkan/rendering_device_driver_vulkan.cpp

@@ -3470,6 +3470,7 @@ Vector<uint8_t> RenderingDeviceDriverVulkan::shader_compile_binary_from_spirv(Ve
 		offset += sizeof(uint32_t);
 		encode_uint32(sizeof(ShaderBinary::Data), binptr + offset);
 		offset += sizeof(uint32_t);
+		offset += sizeof(uint32_t); // Pad to align ShaderBinary::Data to 8 bytes.


To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?

Just mentioning this to flag it, maybe wait for @clayjohn's input before doing it, as I don't have the full picture of how all this stuff is aligned.

We could also just turn it into a real struct perhaps?

I'm also not entirely sure why the size is in there at all since it seems that sizeof(ShaderBinary::Data) is constant.

I don't have any strong opinions. CC @RandomShaper who knows this code better than I do

We could at least add some getters and setters to a structure with a flexible member so that the framing of the packet doesn't have to implemented twice in two different places? One where the packet gets created, and then again where it gets read?

I don't think that should stop us from merging this in my opinion. This fixes a bug, but leaves the hard to maintain code in place. If desired I can propose a PR to make it easier to maintain separately?

To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?

The actual size of ShaderBinary::Data is dynamically determined. And it changing size wouldn't change alignment of it. The problem was that we had it 12 bytes into the structure, but the ShaderBinary::Data structure has uint64_t vertex_input_mask = 0;. This means that the whole structure needs to be aligned to 8 bytes, not 4. That is what the extra padding is for.

Hi everyone! Regarding the changing of sizeof(ShaderBinary::Data) see my reply.

A static_assert would indeed make things safer, but ultimately the responsibility of guarding that is in the function being modified by this PR, because it's a versioned binary data. The PR forgot to bump ShaderBinary::VERSION.

Of course a static_assert would add an extra layer of reminders "hey! remember to bump ShaderBinary::VERSION!".

I really don't know what static assert would have caught this, or the version. The size of ShaderBinary::Data did not change. I added some extra padding to an entirely "free form" memory segment without any structure to it.

Maybe my suggestion was misguided, I didn't look much into it. My thinking was just that if we can say that adding 4 ensures padding to 8, we're making an assumption on what's the size up to that point (and if that size changes in the future, will the +4 padding still be correct?).

@akien-mga yeah, I get it. But due to the way this code works now I don't think there's any way to check it from the "outside" without some kind of unit test. I think it might be better to use packed structs for this and then we could in fact use some assertions to test this potentially.

darksylinc · 2024-12-12T21:59:08Z

Hi!

Four (minor) issues in your PR:

ShaderBinary::VERSION needs to be bumped from 4 to 5. The last time it was bumped was for 4.3 in 7e59864. Otherwise this change will cause crashes once it tries to load caches generated by older stable versions.
The file rendering_device_driver_d3d12.cpp contains extremely similar serialization code as Vulkan's and likely requires the exact same patches. Metal's code appears to do its own thing.
The routine _align_command() is a mismatch of signs, which is a nightmare of Undefined Behaviors (except on C++20) because conversion from unsigned -> signed is not always defined. I believe the following change would avoid issues:

static constexpr uint32_t _align_command(uint32_t in) {
	return (in + 8u - 1u) & ~(8u - 1u);
}

The routine name _align_command() raises the warning:

rendering_device_graph.cpp:38:27: Identifier '_align_command' is reserved because it starts with '_' at global scope

I don't think the _underscore is necessary considering it's a static function only visible to that compilation unit. Or just preprend with gd_

hpvb · 2024-12-12T22:20:22Z

Hi!

Four (minor) issues in your PR:

1. `ShaderBinary::VERSION` needs to be bumped from 4 to 5. The last time it was bumped was for 4.3 in [7e59864](https://github.com/godotengine/godot/commit/7e598642d22055bc3958963a34d6ba2b8c6e88cf). Otherwise this change will cause crashes once it tries to load caches generated by older stable versions.

Okay, that makes sense. Will do.

2. The file [rendering_device_driver_d3d12.cpp](https://github.com/godotengine/godot/blob/19e003bc08ac2fb14fc2434bde9530387afd994c/drivers/d3d12/rendering_device_driver_d3d12.cpp#L3623) contains extremely similar serialization code as Vulkan's and likely requires the exact same patches. Metal's code appears to do its own thing.

Only if the ShaderData for D3D12 / Metal has a uint64_t in it. I don't know if it does. If it does I'll make a separate PR

3. The routine `_align_command()` is a mismatch of signs, which is a nightmare of Undefined Behaviors (except on C++20) because conversion from unsigned -> signed is not always defined. I believe the following change would avoid issues:

static constexpr uint32_t _align_command(uint32_t in) {
	return (in + 8u - 1u) & ~(8u - 1u);
}

You are entirely right, I don't know what I was thinking.

4. The routine name `_align_command()` raises the warning:
rendering_device_graph.cpp:38:27: Identifier 'align_command' is reserved because it starts with '' at global scope

I don't think the _underscore is necessary considering it's a static function only visible to that compilation unit. Or just preprend with gd_

Godot uses _some_function for static local functions as a matter of policy. I have had discussions about not using things starting with _ before, but we decided it was a minor issue and we'd just keep doing it.

RandomShaper · 2024-12-13T09:04:39Z

My thoughts:

I'd suggest adding an encode_uint32(0); before the new offset += sizeof(uint32_t); // Pad to align ShaderBinary::Data to 8 bytes., so the padding value is deterministic.
There's an STEPIFY macro for the same purpose as the newly added _align_command(), but maybe it's not reachable from where this PR needs it. It may be good to put it in some maths or misc functions include. I think other parts of the codebase have their own, so a global fusing of all would be nice.
Regarding the leading underscore, I'd also favor this PR following such convention as it's our current standard. If anything, we should maybe silence the warning (which compiler/linter is raising it?).
Regarding turning all the shader binary (de)serialization code from it's current overly Spartan shape into a more manageable one (in a future PR), yes, please! We're indeed making this more complex and error-prone than it should be.

RandomShaper

All my thoughts are addressable as follow-ups, so this indeed looks good in its current state given its intended scope.

EDIT: Oh, well, my point one in the comment above should be addressed now, but it's not critical.

hpvb · 2024-12-13T11:40:02Z

EDIT: Oh, well, my point one in the comment above should be addressed now, but it's not critical.

I'll zero the padding!

EDIT: all done, this is ready to merge then I think.

akien-mga · 2024-12-13T11:52:37Z

There's an STEPIFY macro for the same purpose as the newly added _align_command(), but maybe it's not reachable from where this PR needs it. It may be good to put it in some maths or misc functions include. I think other parts of the codebase have their own, so a global fusing of all would be nice.

It should be accessible here actually, it's defined in rendering_device_commons.h which is included by way of rendering_device_graph.h. The formula is slightly different but the results might be equivalent? Didn't do the math.

These misaligned accesses are shown in all of our CI hooks. It turned out to not be difficult to fix. It is likely that this will improve performance for aarch64.

hpvb · 2024-12-13T14:33:30Z

I remove the _align_command() function and replaced it with STEPIFY

Repiteo · 2024-12-13T22:23:17Z

Thanks!

hpvb added bug topic:rendering labels Dec 12, 2024

hpvb added this to the 4.4 milestone Dec 12, 2024

hpvb requested a review from a team as a code owner December 12, 2024 16:23

akien-mga reviewed Dec 12, 2024

View reviewed changes

hpvb force-pushed the fix-rendering-alignment branch 2 times, most recently from 8b8e68d to 2d10e68 Compare December 12, 2024 22:29

RandomShaper approved these changes Dec 13, 2024

View reviewed changes

hpvb force-pushed the fix-rendering-alignment branch from 2d10e68 to 281bd0e Compare December 13, 2024 11:47

Fix several ubsan reported misaligned accesses

e674379

These misaligned accesses are shown in all of our CI hooks. It turned out to not be difficult to fix. It is likely that this will improve performance for aarch64.

hpvb force-pushed the fix-rendering-alignment branch from 281bd0e to e674379 Compare December 13, 2024 14:32

akien-mga approved these changes Dec 13, 2024

View reviewed changes

hpvb mentioned this pull request Dec 13, 2024

Fix ubsan reported errors in rendering #100374

Merged

Repiteo merged commit f2c8f17 into godotengine:master Dec 13, 2024
20 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix several ubsan reported misaligned accesses #100325

Fix several ubsan reported misaligned accesses #100325

hpvb commented Dec 12, 2024

akien-mga Dec 12, 2024

hpvb Dec 12, 2024

hpvb Dec 12, 2024

clayjohn Dec 12, 2024

hpvb Dec 12, 2024

hpvb Dec 12, 2024 •

edited

Loading

darksylinc Dec 12, 2024

hpvb Dec 12, 2024

akien-mga Dec 12, 2024

hpvb Dec 12, 2024

darksylinc commented Dec 12, 2024 •

edited

Loading

hpvb commented Dec 12, 2024

RandomShaper commented Dec 13, 2024 •

edited

Loading

RandomShaper left a comment •

edited

Loading

hpvb commented Dec 13, 2024 •

edited

Loading

akien-mga commented Dec 13, 2024 •

edited

Loading

hpvb commented Dec 13, 2024

Repiteo commented Dec 13, 2024

Fix several ubsan reported misaligned accesses #100325

Fix several ubsan reported misaligned accesses #100325

Conversation

hpvb commented Dec 12, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hpvb Dec 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

darksylinc commented Dec 12, 2024 • edited Loading

hpvb commented Dec 12, 2024

RandomShaper commented Dec 13, 2024 • edited Loading

RandomShaper left a comment • edited Loading

Choose a reason for hiding this comment

hpvb commented Dec 13, 2024 • edited Loading

akien-mga commented Dec 13, 2024 • edited Loading

hpvb commented Dec 13, 2024

Repiteo commented Dec 13, 2024

hpvb Dec 12, 2024 •

edited

Loading

darksylinc commented Dec 12, 2024 •

edited

Loading

RandomShaper commented Dec 13, 2024 •

edited

Loading

RandomShaper left a comment •

edited

Loading

hpvb commented Dec 13, 2024 •

edited

Loading

akien-mga commented Dec 13, 2024 •

edited

Loading