Fix several ubsan reported misaligned accesses #100325
@@ -3470,6 +3470,7 @@ Vector<uint8_t> RenderingDeviceDriverVulkan::shader_compile_binary_from_spirv(Ve
 offset += sizeof(uint32_t);
 encode_uint32(sizeof(ShaderBinary::Data), binptr + offset);
 offset += sizeof(uint32_t);
+offset += sizeof(uint32_t); // Pad to align ShaderBinary::Data to 8 bytes.
To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?
Just mentioning this to flag it, maybe wait for @clayjohn's input before doing it, as I don't have the full picture of how all this stuff is aligned.
We could also just turn it into a real struct perhaps?
I'm also not entirely sure why the size is in there at all since it seems that sizeof(ShaderBinary::Data) is constant.
I don't have any strong opinions. CC @RandomShaper who knows this code better than I do
We could at least add some getters and setters to a structure with a flexible member, so that the framing of the packet doesn't have to be implemented twice in two different places: once where the packet gets created, and again where it gets read.
I don't think that should stop us from merging this. This fixes a bug, but leaves the hard-to-maintain code in place. If desired, I can propose a separate PR to make it easier to maintain.
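To illustrate the "implement the framing once" idea, here is a minimal sketch in which the writer and the reader share one set of offsets. Everything in it is hypothetical: `Payload` stands in for ShaderBinary::Data (whose real fields differ), and `Frame`, its offsets, and the magic/version fields are made up for the example rather than taken from the Godot source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical payload standing in for ShaderBinary::Data.
struct Payload {
	uint64_t vertex_input_mask = 0;
	uint32_t stage_count = 0;
	uint32_t reserved = 0;
};

// Hypothetical frame layout, described once and used by both write() and read().
struct Frame {
	static constexpr size_t MAGIC_OFFSET = 0;
	static constexpr size_t VERSION_OFFSET = 4;
	static constexpr size_t SIZE_OFFSET = 8;
	static constexpr size_t PAD_OFFSET = 12; // Explicit padding word, left as zero.
	static constexpr size_t PAYLOAD_OFFSET = 16;
	static constexpr size_t TOTAL_SIZE = PAYLOAD_OFFSET + sizeof(Payload);

	static void put_u32(std::vector<uint8_t> &buf, size_t offset, uint32_t value) {
		std::memcpy(buf.data() + offset, &value, sizeof(value));
	}

	static uint32_t get_u32(const std::vector<uint8_t> &buf, size_t offset) {
		uint32_t value = 0;
		std::memcpy(&value, buf.data() + offset, sizeof(value));
		return value;
	}

	static std::vector<uint8_t> write(uint32_t magic, uint32_t version, const Payload &payload) {
		std::vector<uint8_t> buf(TOTAL_SIZE, 0); // Zero-filled, so the padding word is zero.
		put_u32(buf, MAGIC_OFFSET, magic);
		put_u32(buf, VERSION_OFFSET, version);
		put_u32(buf, SIZE_OFFSET, static_cast<uint32_t>(sizeof(Payload)));
		std::memcpy(buf.data() + PAYLOAD_OFFSET, &payload, sizeof(Payload));
		return buf;
	}

	static bool read(const std::vector<uint8_t> &buf, uint32_t magic, uint32_t version, Payload &out) {
		if (buf.size() < TOTAL_SIZE) {
			return false;
		}
		if (get_u32(buf, MAGIC_OFFSET) != magic || get_u32(buf, VERSION_OFFSET) != version) {
			return false;
		}
		if (get_u32(buf, SIZE_OFFSET) != sizeof(Payload)) {
			return false;
		}
		std::memcpy(&out, buf.data() + PAYLOAD_OFFSET, sizeof(Payload));
		return true;
	}
};

// The payload offset must be a multiple of the payload's alignment.
static_assert(Frame::PAYLOAD_OFFSET % alignof(Payload) == 0, "Payload must start on an aligned offset.");
```

Because both directions go through the same constants, adding a header field means touching one place, and the `static_assert` fails to compile if the payload would end up misaligned.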
> To make sure we don't get regressions from ShaderBinary::Data changing size in the future, should we have some static_assert for the assumption we're making here / above / below?
The actual size of ShaderBinary::Data is dynamically determined, and a change in its size wouldn't change its alignment. The problem was that we placed it 12 bytes into the buffer, but the ShaderBinary::Data structure has a uint64_t vertex_input_mask = 0; member. This means that the whole structure needs to be aligned to 8 bytes, not 4. That is what the extra padding is for.
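The arithmetic can be checked with a small sketch. `Data` here is a hypothetical stand-in that only mirrors the uint64_t member mentioned above, and the sketch assumes a typical 64-bit target where alignof(uint64_t) is 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ShaderBinary::Data; only the uint64_t member matters here.
struct Data {
	uint64_t vertex_input_mask = 0;
	uint32_t other = 0;
};

// Size in bytes of a header made of `word_count` uint32_t fields.
constexpr size_t header_size(size_t word_count) {
	return word_count * sizeof(uint32_t);
}

// True if a Data object may legally be accessed at this byte offset
// (assuming the buffer itself starts on a suitably aligned address).
constexpr bool is_aligned_for_data(size_t offset) {
	return offset % alignof(Data) == 0;
}
```

With three header words the payload would start at offset 12, which is not a multiple of the struct's alignment; a fourth (padding) word moves it to offset 16.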
Hi everyone! Regarding the changing of sizeof(ShaderBinary::Data), see my reply.
A static_assert would indeed make things safer, but ultimately the responsibility of guarding that is in the function being modified by this PR, because it's versioned binary data. The PR forgot to bump ShaderBinary::VERSION.
Of course a static_assert would add an extra layer of reminders: "hey! remember to bump ShaderBinary::VERSION!".
I really don't know what static assert would have caught this, or the version. The size of ShaderBinary::Data did not change. I added some extra padding to an entirely "free form" memory segment without any structure to it.
Maybe my suggestion was misguided, I didn't look much into it. My thinking was just that if we can say that adding 4 ensures padding to 8, we're making an assumption on what's the size up to that point (and if that size changes in the future, will the +4 padding still be correct?).
@akien-mga yeah, I get it. But due to the way this code works now I don't think there's any way to check it from the "outside" without some kind of unit test. I think it might be better to use packed structs for this and then we could in fact use some assertions to test this potentially.
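As a rough sketch of what such assertions could look like, the header can be described as a struct and checked at compile time. This uses a plain struct whose members leave no implicit padding instead of a compiler-specific packed attribute, and all names (`Header`, its fields, `Data`, `data_offset`) are hypothetical rather than taken from the Godot source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for ShaderBinary::Data.
struct Data {
	uint64_t vertex_input_mask = 0;
	uint32_t other = 0;
};

// Hypothetical header that precedes Data in the binary blob.
struct Header {
	uint32_t magic;
	uint32_t version;
	uint32_t data_size;
	uint32_t padding; // Explicit padding word, zeroed on write.
};

static_assert(std::is_standard_layout<Header>::value, "Header must have a predictable layout.");
static_assert(sizeof(Header) == 4 * sizeof(uint32_t), "Header must not contain implicit padding.");
static_assert(sizeof(Header) % alignof(Data) == 0, "Data must start on an aligned offset.");

// Byte offset at which Data begins inside the blob.
constexpr size_t data_offset() {
	return sizeof(Header);
}
```

If someone later adds or removes a header field without adjusting the padding, the third assertion stops the build, which is the kind of check that is hard to express while the layout only exists as a sequence of `offset +=` statements.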
Hi! Four (minor) issues in your PR:
static constexpr uint32_t _align_command(uint32_t in) {
return (in + 8u - 1u) & ~(8u - 1u);
}
I don't think the _underscore is necessary considering it's a static function only visible to that compilation unit. Or just prepend with |
Okay, that makes sense. Will do.
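For reference, the round-up helper quoted above can be generalized to any power-of-two alignment. This is a minimal sketch; the name `align_up` and the second parameter are assumptions made for the example, not names from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `in` up to the next multiple of `alignment`.
// `alignment` must be a power of two, otherwise the mask trick is invalid.
constexpr uint32_t align_up(uint32_t in, uint32_t alignment) {
	return (in + alignment - 1u) & ~(alignment - 1u);
}

// The case discussed in this PR: a 12-byte header rounded up to 8-byte alignment.
static_assert(align_up(12u, 8u) == 16u, "12 rounds up to 16 for 8-byte alignment.");
```

Values that are already aligned are returned unchanged, so the helper can be applied unconditionally after every write.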
Only if the ShaderData for D3D12 / Metal has a uint64_t in it. I don't know if it does. If it does, I'll make a separate PR.
You are entirely right, I don't know what I was thinking.
Godot uses |
Force-pushed from 8b8e68d to 2d10e68.
My thoughts:
|
All my thoughts are addressable as follow-ups, so this indeed looks good in its current state given its intended scope.
EDIT: Oh, well, my point one in the comment above should be addressed now, but it's not critical.
I'll zero the padding! EDIT: all done, this is ready to merge then I think.
Force-pushed from 2d10e68 to 281bd0e.
It should be accessible here actually, it's defined in |
These misaligned accesses are shown in all of our CI hooks. It turned out to not be difficult to fix. It is likely that this will improve performance for aarch64.
Force-pushed from 281bd0e to e674379.
I removed the _align_command() function and replaced it with STEPIFY.
Thanks!