Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several ubsan reported misaligned accesses #100325

Merged
merged 1 commit into from
Dec 13, 2024

Conversation

hpvb
Copy link
Member

@hpvb hpvb commented Dec 12, 2024

These misaligned accesses are shown in all of our CI hooks. It turned out to not be difficult to fix.

It is likely that this will improve performance for aarch64.

@hpvb hpvb added this to the 4.4 milestone Dec 12, 2024
@hpvb hpvb requested a review from a team as a code owner December 12, 2024 16:23
@@ -3470,6 +3470,7 @@ Vector<uint8_t> RenderingDeviceDriverVulkan::shader_compile_binary_from_spirv(Ve
offset += sizeof(uint32_t);
encode_uint32(sizeof(ShaderBinary::Data), binptr + offset);
offset += sizeof(uint32_t);
offset += sizeof(uint32_t); // Pad to align ShaderBinary::Data to 8 bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?

Just mentioning this to flag it, maybe wait for @clayjohn's input before doing it, as I don't have the full picture of how all this stuff is aligned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just turn it into a real struct perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not entirely sure why the size is in there at all since it seems that sizeof(ShaderBinary::Data) is constant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any strong opinions. CC @RandomShaper who knows this code better than I do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could at least add some getters and setters to a structure with a flexible member so that the framing of the packet doesn't have to implemented twice in two different places? One where the packet gets created, and then again where it gets read?

I don't think that should stop us from merging this in my opinion. This fixes a bug, but leaves the hard to maintain code in place. If desired I can propose a PR to make it easier to maintain separately?

Copy link
Member Author

@hpvb hpvb Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?

The actual size of ShaderBinary::Data is dynamically determined. And it changing size wouldn't change alignment of it. The problem was that we had it 12 bytes into the structure, but the ShaderBinary::Data structure has uint64_t vertex_input_mask = 0;. This means that the whole structure needs to be aligned to 8 bytes, not 4. That is what the extra padding is for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi everyone! Regarding the changing of sizeof(ShaderBinary::Data) see my reply.

A static_assert would indeed make things safer, but ultimately the responsibility of guarding that is in the function being modified by this PR, because it's a versioned binary data. The PR forgot to bump ShaderBinary::VERSION.

Of course a static_assert would add an extra layer of reminders "hey! remember to bump ShaderBinary::VERSION!".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't know what static assert would have caught this, or the version. The size of ShaderBinary::Data did not change. I added some extra padding to an entirely "free form" memory segment without any structure to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe my suggestion was misguided, I didn't look much into it. My thinking was just that if we can say that adding 4 ensures padding to 8, we're making an assumption on what's the size up to that point (and if that size changes in the future, will the +4 padding still be correct?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akien-mga yeah, I get it. But due to the way this code works now I don't think there's any way to check it from the "outside" without some kind of unit test. I think it might be better to use packed structs for this and then we could in fact use some assertions to test this potentially.

@darksylinc
Copy link
Contributor

darksylinc commented Dec 12, 2024

Hi!

Four (minor) issues in your PR:

  1. ShaderBinary::VERSION needs to be bumped from 4 to 5. The last time it was bumped was for 4.3 in 7e59864. Otherwise this change will cause crashes once it tries to load caches generated by older stable versions.
  2. The file rendering_device_driver_d3d12.cpp contains extremely similar serialization code as Vulkan's and likely requires the exact same patches. Metal's code appears to do its own thing.
  3. The routine _align_command() is a mismatch of signs, which is a nightmare of Undefined Behaviors (except on C++20) because conversion from unsigned -> signed is not always defined. I believe the following change would avoid issues:
static constexpr uint32_t _align_command(uint32_t in) {
	return (in + 8u - 1u) & ~(8u - 1u);
}
  1. The routine name _align_command() raises the warning:

rendering_device_graph.cpp:38:27: Identifier '_align_command' is reserved because it starts with '_' at global scope

I don't think the _underscore is necessary considering it's a static function only visible to that compilation unit. Or just preprend with gd_

@hpvb
Copy link
Member Author

hpvb commented Dec 12, 2024

Hi!

Four (minor) issues in your PR:

1. `ShaderBinary::VERSION` needs to be bumped from 4 to 5. The last time it was bumped was for 4.3 in [7e59864](https://github.com/godotengine/godot/commit/7e598642d22055bc3958963a34d6ba2b8c6e88cf). Otherwise this change will cause crashes once it tries to load caches generated by older stable versions.

Okay, that makes sense. Will do.

2. The file [rendering_device_driver_d3d12.cpp](https://github.com/godotengine/godot/blob/19e003bc08ac2fb14fc2434bde9530387afd994c/drivers/d3d12/rendering_device_driver_d3d12.cpp#L3623) contains extremely similar serialization code as Vulkan's and likely requires the exact same patches. Metal's code appears to do its own thing.

Only if the ShaderData for D3D12 / Metal has a uint64_t in it. I don't know if it does. If it does I'll make a separate PR

3. The routine `_align_command()` is a mismatch of signs, which is a nightmare of Undefined Behaviors (except on C++20) because conversion from unsigned -> signed is not always defined. I believe the following change would avoid issues:
static constexpr uint32_t _align_command(uint32_t in) {
	return (in + 8u - 1u) & ~(8u - 1u);
}

You are entirely right, I don't know what I was thinking.

4. The routine name `_align_command()` raises the warning:

rendering_device_graph.cpp:38:27: Identifier 'align_command' is reserved because it starts with '' at global scope

I don't think the _underscore is necessary considering it's a static function only visible to that compilation unit. Or just preprend with gd_

Godot uses _some_function for static local functions as a matter of policy. I have had discussions about not using things starting with _ before, but we decided it was a minor issue and we'd just keep doing it.

@hpvb hpvb force-pushed the fix-rendering-alignment branch 2 times, most recently from 8b8e68d to 2d10e68 Compare December 12, 2024 22:29
@RandomShaper
Copy link
Member

RandomShaper commented Dec 13, 2024

My thoughts:

  1. I'd suggest adding an encode_uint32(0); before the new offset += sizeof(uint32_t); // Pad to align ShaderBinary::Data to 8 bytes., so the padding value is deterministic.
  2. There's an STEPIFY macro for the same purpose as the newly added _align_command(), but maybe it's not reachable from where this PR needs it. It may be good to put it in some maths or misc functions include. I think other parts of the codebase have their own, so a global fusing of all would be nice.
  3. Regarding the leading underscore, I'd also favor this PR following such convention as it's our current standard. If anything, we should maybe silence the warning (which compiler/linter is raising it?).
  4. Regarding turning all the shader binary (de)serialization code from it's current overly Spartan shape into a more manageable one (in a future PR), yes, please! We're indeed making this more complex and error-prone than it should be.

Copy link
Member

@RandomShaper RandomShaper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my thoughts are addressable as follow-ups, so this indeed looks good in its current state given its intended scope.

EDIT: Oh, well, my point one in the comment above should be addressed now, but it's not critical.

@hpvb
Copy link
Member Author

hpvb commented Dec 13, 2024

EDIT: Oh, well, my point one in the comment above should be addressed now, but it's not critical.

I'll zero the padding!

EDIT: all done, this is ready to merge then I think.

@hpvb hpvb force-pushed the fix-rendering-alignment branch from 2d10e68 to 281bd0e Compare December 13, 2024 11:47
@akien-mga
Copy link
Member

akien-mga commented Dec 13, 2024

  1. There's an STEPIFY macro for the same purpose as the newly added _align_command(), but maybe it's not reachable from where this PR needs it. It may be good to put it in some maths or misc functions include. I think other parts of the codebase have their own, so a global fusing of all would be nice.

It should be accessible here actually, it's defined in rendering_device_commons.h which is included by way of rendering_device_graph.h. The formula is slightly different but the results might be equivalent? Didn't do the math.

These misaligned accesses are shown in all of our CI hooks. It turned
out to not be difficult to fix.

It is likely that this will improve performance for aarch64.
@hpvb hpvb force-pushed the fix-rendering-alignment branch from 281bd0e to e674379 Compare December 13, 2024 14:32
@hpvb
Copy link
Member Author

hpvb commented Dec 13, 2024

I remove the _align_command() function and replaced it with STEPIFY

@Repiteo Repiteo merged commit f2c8f17 into godotengine:master Dec 13, 2024
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Dec 13, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants