Dereferencing a funcref from the function it calls leads to a heap-use-after-free #75658
Comments
I've run into this issue at least a dozen separate times now in unrelated parts of my codebase, in multiple projects, in Godot 3 and in Godot 4 as well. I seriously can't be the only one encountering this 😆

Updated example that works with Godot 4.3:

Build Godot with ASan, run the callable_example scene and observe the ASan error:
This sometimes leads to a crash in non-ASan builds (in editor, debug, and release template builds), but it's very difficult to reproduce that way.

It happens because a RefCounted object's refcount can go to zero, causing the object to be freed, while one of its methods is still somewhere on the stack. When the stack unwinds to that method, there's a heap-use-after-free.

The usage pattern that tends to hit this bug in real code is one where an object emits a signal when it's finished, and the consumer of that signal clears the sole reference to that object (whether intentionally, incidentally, or indirectly). The other example in that project is more along these lines, and demonstrates just how insidious this bug can be:
When I look at this, I don't tend to think that when

Edit to add: I should say it can be worked around by carefully rewriting code. But users can only do that if they know the bug is there, and what causes it.
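To make the mechanism described in this comment concrete, here is a small C++ analogy. It is not the GDScript example from the project and does not use Godot's classes: `Job`, `finish`, `on_finished` and `freed_during_call` are hypothetical names, and `std::shared_ptr` stands in for RefCounted. The sketch only detects that the object was destroyed while its own method was still running; it deliberately avoids touching any member afterwards, since that access is what ASan would flag.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a RefCounted object that reports completion
// through a callback (playing the role of a signal).
class Job {
public:
    explicit Job(bool &destroyed) : destroyed_(destroyed) {}
    ~Job() { destroyed_ = true; }

    std::function<void()> on_finished;

    // Returns true if this object was destroyed before finish() returned.
    bool finish() {
        bool &destroyed = destroyed_;  // external flag, outlives the Job
        auto callback = on_finished;   // local copy, outlives the Job too
        if (callback) callback();      // the consumer may drop the last reference here
        // From here on `this` may dangle. Reading or writing any member
        // would be the heap-use-after-free; only locals are used below.
        return destroyed;
    }

private:
    bool &destroyed_;
};

// The consumer clears its sole reference from inside the callback.
bool freed_during_call() {
    bool destroyed = false;
    auto holder = std::make_shared<Job>(destroyed);
    holder->on_finished = [&holder] { holder.reset(); };
    assert(holder.use_count() == 1);  // the holder is the only owner
    return holder->finish();
}
```

Calling `freed_during_call()` returns `true` in this sketch: the object is gone while `finish()` is still on the stack, which is the situation the comment describes.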
CC @godotengine/gdscript @godotengine/core
This bug seems caused by

Likely this is holding onto a reference to the object and then trying to change a ref count after deletion. Removing the line

Still investigating. It likely needs something like a temporary bumped ref count to defer deletion until both the call and the

UPDATE: The bodge fix BTW is just storing the

There's already some protection against calling

so this problem has come up before in some form, but that doesn't protect against this type of deletion.
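The "temporary bumped ref count" idea from the comment above can be sketched in plain C++. This is an illustration under assumptions, not engine code: `GuardedJob`, `finish` and `guarded_call_result` are hypothetical names, and `std::enable_shared_from_this` stands in for taking an extra reference on a RefCounted object for the duration of the call.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical object that holds an extra reference to itself while one of
// its methods is running, so deletion is deferred until the method returns.
class GuardedJob : public std::enable_shared_from_this<GuardedJob> {
public:
    explicit GuardedJob(bool &destroyed) : destroyed_(destroyed) {}
    ~GuardedJob() { destroyed_ = true; }

    std::function<void()> on_finished;
    int finished_count = 0;

    // Returns whether the object was destroyed before the method finished.
    bool finish() {
        auto keep_alive = shared_from_this();  // temporary extra reference
        if (on_finished) on_finished();        // consumer may drop its reference
        ++finished_count;                      // safe: keep_alive still owns us
        return destroyed_;                     // evaluated before keep_alive is released
    }

private:
    bool &destroyed_;
};

// first: destroyed during the call? second: destroyed after the call returned?
std::pair<bool, bool> guarded_call_result() {
    bool destroyed = false;
    auto holder = std::make_shared<GuardedJob>(destroyed);
    holder->on_finished = [&holder] { holder.reset(); };
    const bool during = holder->finish();
    return {during, destroyed};
}
```

In this sketch the object survives until `finish()` returns and is freed immediately afterwards. The object's lifetime is therefore slightly longer than without the guard.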
Actually the bodge fix might not be so bad after all. We could test whether the object is a

Looking up instance ids is probably super slow but it seems debug only. But using it for every

There's also the problem that if we did increase the refcount to prevent this (debug) bug, we would probably need to do it in release too, because the lifetime would change subtly and could lead to further hard-to-debug bugs in the future. So maybe the bodge is better. Maybe there's a more elegant way though.
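As a rough illustration of the instance-id approach discussed here: remember the object's id before the call, then look it up again before touching the object. The sketch below assumes that is what the bodge amounts to (the original snippet is not shown), and it uses a toy registry rather than Godot's ObjectDB. `LiveIds`, `TrackedJob` and `call_and_check` are hypothetical names.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_set>

using InstanceId = std::uint64_t;

// Toy stand-in for an object database: tracks which ids are still alive.
class LiveIds {
public:
    InstanceId add() { live_.insert(next_); return next_++; }
    void remove(InstanceId id) { live_.erase(id); }
    bool contains(InstanceId id) const { return live_.count(id) != 0; }

private:
    std::unordered_set<InstanceId> live_;
    InstanceId next_ = 1;
};

// Hypothetical object that registers itself on construction and
// unregisters on destruction.
class TrackedJob {
public:
    explicit TrackedJob(LiveIds &live) : live_(live), id_(live.add()) {}
    ~TrackedJob() { live_.remove(id_); }

    std::function<void()> on_finished;
    int finished_count = 0;
    InstanceId id() const { return id_; }

private:
    LiveIds &live_;
    InstanceId id_;
};

// Caller side: returns false if the object died during the callback.
bool call_and_check(TrackedJob *job, const LiveIds &live) {
    const InstanceId id = job->id();   // stored before the call
    auto callback = job->on_finished;  // local copy outlives the object
    if (callback) callback();
    if (!live.contains(id)) return false;  // object is gone: do not touch it
    ++job->finished_count;                 // still alive, safe to use
    return true;
}

bool call_after_reference_dropped() {
    LiveIds live;
    auto holder = std::make_unique<TrackedJob>(live);
    holder->on_finished = [&holder] { holder.reset(); };
    return call_and_check(holder.get(), live);
}

bool call_with_reference_kept() {
    LiveIds live;
    auto holder = std::make_unique<TrackedJob>(live);
    holder->on_finished = [] {};
    return call_and_check(holder.get(), live) && holder->finished_count == 1;
}
```

Unlike bumping the refcount, this leaves the object's lifetime unchanged and only skips the post-call access. The cost is one lookup per call.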
It seems our paths crossed. I've made a fix based on
That looks good, I'll check in a second. 👍

Ah yes, your PR looks to be doing the same as my suggestion, with the ObjectDB check under the hood. I don't think we have this function in 3.x, so there it might have to be more like the above.

On the other hand I'm wondering now if we can do this more efficiently:

This depends on the pattern being single threaded (no idea if it is, and this may differ in 3.x and 4.x). If not, then the ObjectDB approach may be better.

UPDATE:
Another issue would be the predelete handler. When the object's refcount reaches 0, it frees the object, triggering the predelete_handler and NOTIFICATION_PREDELETE. That could then do a method call and re-increment the refcount. I think this would cause an infinite loop unless how the predelete handler works was also changed somehow?
EDIT: I looked into whether this would cause an issue, and it appears that deletion doesn't occur until the end of the
I know next to zero about the
Godot version
3.5.1.stable
System information
Arch Linux (rolling release)
Issue description
Dereferencing a funcref from the function it calls can cause an error in AddressSanitizer in debug/release-debug builds. It might be possible for this to cause more subtle heap corruption issues in release builds--I suspect it is behind a rare/unreproducible crash in my project.
I think what happens in the below example is that when `foobar` dereferences the funcref, the funcref is freed. Then when control flow returns from `foobar` to `FuncRef.call_func` the `this` pointer is invalid. It seems weird that function calls do not implicitly hold a reference to the object they're on.

Note that this isn't unique to FuncRefs - you can implement a FuncRef in pure GDScript and it will have the same bug.
Steps to reproduce
Put this script on a node and then play the scene, using a release_debug build with asan on:
Minimal reproduction project
N/A