VideoStreamPlayer causes game to crash upon playing video #102982

Closed
0xcafeb33f opened this issue Feb 18, 2025 · 29 comments · Fixed by #103077 or #103176

@0xcafeb33f
Contributor

Tested versions

  • Reproducible in v4.4.beta4.official [93d2706]
  • Not reproducible in v4.4.beta3.official [06acfcc]

The only VideoStreamPlayer-related commit I saw in beta4 was #101958, so I think it is a likely cause of this issue.

System information

Windows 11 - both Forward+ and Compatibility - both with and without dedicated GPU

Issue description

Game closes without displaying any in-editor error messages upon playing a VideoStreamPlayer in 4.4 beta4.

Steps to reproduce

Play an Ogg Theora stream through a VideoStreamPlayer, either with autoplay=true or play(). Seems to affect both Forward+ and Compatibility renderers (haven't tested Mobile).

Minimal reproduction project (MRP)

videoplayerbug.zip
The main scene can be played to reproduce the issue. In beta3, this works as expected and a blue box moves across the screen. In beta4, the game exits immediately.

@AThousandShips
Member

CC @berarma

@berarma
Contributor

berarma commented Feb 18, 2025

I can't reproduce it on Linux. The MRP works well here.

I'll try it on a machine with Windows 11. In the meantime, if someone could get a crash backtrace, it may speed things up.

Does it crash before displaying anything or when the video is already playing?

@arisona

arisona commented Feb 18, 2025

Can't reproduce on macOS (using provided videoplayerbug project).

Godot v4.4.beta4 - macOS Sequoia (15.3.1) - Multi-window, 2 monitors - Metal (Forward+) - integrated Apple M1 Pro (Apple7) - Apple M1 Pro (10 threads)

@AThousandShips
Member

Can replicate this on Windows 11, will provide a stacktrace soon

@AThousandShips
Member

I can't confirm it on a custom build on the same branch, so I'm unsure what differs between the two builds. I can't provide a stacktrace either, as the crash doesn't occur in a debug build.

@Zireael07
Contributor

What compiler and compiler version are you using? What compiler does the official beta use?

@berarma
Contributor

berarma commented Feb 18, 2025

I can't confirm it on a custom build on the same branch, so I'm unsure what differs between the two builds. I can't provide a stacktrace either, as the crash doesn't occur in a debug build.

Thanks for testing it.

What compiler and compiler version are you using? What compiler does the official beta use?

Looking at the build scripts it seems to be built using mingw-llvm (read comment below from akien-mga).

@AThousandShips
Member

I'm currently testing a production build, using MSVC in both cases, so this might be MinGW-specific. If it doesn't happen on a production MSVC build, someone would need to build with MinGW to test. (I don't have an editor build set up for that, so it would be easier if someone who already uses it could test, to avoid the build time.)

@akien-mga
Member

What compiler and compiler version are you using? What compiler does the official beta use?

Looking at the build scripts it seems to be built using mingw-llvm.

It's actually mingw-gcc 14.2.1 from Fedora 41 for x86_64 and x86_32, and llvm-mingw for arm64 only.

Also relevant, official builds are made with GCC LTO (production=yes), which may be a factor here.

@AThousandShips
Member

Doesn't happen in a production=yes MSVC build so likely gcc specific

@berarma
Contributor

berarma commented Feb 18, 2025

GCC and Windows specific.

I've been able to reproduce the issue on a Windows 10 VM using the official build, and confirmed that beta3 works on the same VM. Now I'm trying to build a custom version for Windows with D3D12 support, but it doesn't run on the same VM because it requires OpenGL 3.3 or D3D11. It looks like I haven't built with D3D12 support, although I've followed the instructions.

@akien-mga
Member

You probably need to build with ANGLE to be able to run on that VM. https://docs.godotengine.org/en/stable/contributing/development/compiling/compiling_for_windows.html#compiling-with-angle-support

@berarma
Contributor

berarma commented Feb 18, 2025

I've made a custom build with the command scons platform=windows debug_symbols=yes optimize=debug d3d12=yes angle_libs=/opt/angle/ linker=mold, and it works.

Now I'm building with production=yes.

@berarma
Contributor

berarma commented Feb 18, 2025

This is the backtrace and this is the build command: scons platform=windows d3d12=yes angle_libs=/opt/angle/ linker=mold use_static_cpp=yes lto=auto debug_symbols=yes

CrashHandlerException: Program crashed with signal 11
Engine version: Godot Engine v4.4.beta.custom_build (93d270693079ea7802c9e1334a2e0ecd8529eeed)
Dumping the backtrace. Please include this when reporting the bug to the project developer.
[1] _gnu_exception_handler (./mingw-w64-crt/crt/crt_handler.c:223)
[2] __tmainCRTStartup (./mingw-w64-crt/crt/crtexe.c:321)
[3] WinMainCRTStartup (./mingw-w64-crt/crt/crtexe.c:176)
-- END OF BACKTRACE --

Is this an exception handler crash?

@akien-mga
Member

Not sure why the crash handler backtrace looks like this; it's probably a separate issue we need to keep looking into, @bruvzg.

I ran the reproduction project through gdb and got a more helpful backtrace:

Thread 1 received signal SIGSEGV, Segmentation fault.
0x00000001404242e6 in oc_huff_token_decode_c ()
(gdb) bt
#0  0x00000001404242e6 in oc_huff_token_decode_c ()
#1  0x000000014049b09e in VideoStreamPlaybackTheora::update(double) [clone .part.0] ()
#2  0x0000000142cfd9f2 in VideoStreamPlayer::play() ()
#3  0x00000001449648c1 in Object::notification(int, bool) ()
#4  0x0000000142802d1f in Node::_propagate_enter_tree() ()
#5  0x0000000142803015 in Node::_propagate_enter_tree() ()
#6  0x000000014282a0c1 in Node::_set_tree(SceneTree*) ()
#7  0x000000014006f748 in widechar_main(int, wchar_t**) ()
#8  0x0000000145353ba3 in main ()

Here's how I did it for the record, which might be helpful for @berarma to debug further from Linux:

  • Compiled Godot in a fedora:41 podman container with mingw-gcc and just scons p=windows lto=full debug_symbols=yes
  • Installed wine, and mingw64-gdb, which provides a Windows version of gdb in /usr/x86_64-w64-mingw32/sys-root/mingw/bin/gdb.exe
  • cd to the MRP folder
  • wine /usr/x86_64-w64-mingw32/sys-root/mingw/bin/gdb.exe path/to/godot.windows.editor.x86_64.exe
  • In the gdb prompt, r --headless video_player.tscn

@akien-mga
Member

I discussed this with @hpvb, who had a hunch that it might be related to another GCC LTO bug we're currently debugging in #102867. The flags suggested in #102867 (comment) do indeed seem to solve this crash in my test.

@berarma
Contributor

berarma commented Feb 19, 2025

@akien-mga, I got the same backtrace as you. The process was a bit different since I'm on Debian, but I just had to translate your steps.

There seems to be a gap in the backtrace: I'm not directly calling oc_huff_token_decode_c in the code, yet the two frames are next to each other in the backtrace. Also, I can't see line numbers or the source code; gdb complains that there is no symbol table. Without this information it's hard to know what's happening.

In my tests, building with LTO and the default optimization level triggered the crash, but not always. Since I couldn't get a good backtrace, I added several printf calls to narrow down the code causing the crash, and then it stopped crashing. The only change was the printf lines.

If I could get the debug symbols to work I could investigate more; without them, I'm out of ideas to try. I'll test a build with the flags suggested in #102867 to confirm your results and follow your work there.

@berarma
Contributor

berarma commented Feb 19, 2025

Confirmed it doesn't crash with the flags in #102867 (comment). I also tested adding d3d12=yes angle=... on both Wine and Windows, and it works there too.

I've resorted to reading the assembly to find out the line, and this is the line that causes the segmentation fault:

n=_tree[node];

The second argument (_tree) to the function oc_huff_token_decode_c is a bad pointer. This function is called with static int16 arrays defined inside static functions for the second argument, like this:

ret=oc_huff_token_decode(_opb,OC_SB_RUN_TREE);

And this is where the array is defined:

static const ogg_int16_t OC_SB_RUN_TREE[22]={

There are several calls to this function in this file, but since I don't have a complete backtrace I can't tell which one is causing the crash. Most of them use these static arrays, while some others use arrays read from the video headers.
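
To illustrate why a bad _tree pointer would fault exactly on that line, here is a minimal sketch of a table-driven Huffman token decoder in C. It is not libtheora's code: the bit_reader type, read_bits, DEMO_TREE, and the simplified leaf encoding (a non-positive entry stores the negated token) are assumptions made for this example. Only the loop shape, with tree[node] as the first read, mirrors the line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; NOT libtheora's oc_pack_buf. */
typedef struct {
    const uint8_t *data;
    size_t size;   /* buffer length in bytes */
    size_t bitpos; /* index of the next bit to read */
} bit_reader;

/* Read n bits, most significant first; bits past the end read as 0. */
static int read_bits(bit_reader *br, int n) {
    int value = 0;
    for (int i = 0; i < n; i++) {
        int bit = 0;
        size_t byte = br->bitpos >> 3;
        if (byte < br->size) {
            bit = (br->data[byte] >> (7 - (br->bitpos & 7))) & 1;
        }
        br->bitpos++;
        value = (value << 1) | bit;
    }
    return value;
}

/* Flattened tree layout (an assumption for this sketch):
 *   tree[node]            = n, the number of bits to read at this node
 *   tree[node + 1 + bits] = child entry; > 0 is the index of the next node,
 *                           <= 0 is a leaf storing the negated token. */
static const int16_t DEMO_TREE[] = {
    1,  0,  3, /* node 0: "0" -> token 0, "1" -> node 3            */
    1, -1, -2  /* node 3: "10" -> token 1, "11" -> token 2         */
};

int huff_token_decode(bit_reader *br, const int16_t *tree) {
    assert(br != NULL && tree != NULL);
    int node = 0;
    for (;;) {
        int n = tree[node]; /* first read through tree: faults if tree is a bad pointer */
        int bits = read_bits(br, n);
        node = tree[node + 1 + bits];
        if (node <= 0) break; /* reached a leaf */
    }
    return -node;
}
```

With the single input byte 0x58 (bits 01011000), successive calls on DEMO_TREE yield the tokens 0, 1, 2, 0. Because the loop's first action is to read tree[node] with node equal to 0, an invalid tree pointer faults before any bits are consumed, which is consistent with the crash site found in the assembly.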

I've tried building with ccflags="-fno-inline" to try and get a complete backtrace, but it fails with these errors:

lto1: error: two or more sections for .gnu.lto__ZZL40_register_variant_builtin_methods_stringvEN28Method_StringName_hex_to_int18get_argument_countEv.34588887.ba4a6b712d465bb9
lto1: error: two or more sections for .gnu.lto__ZZN6embree12parallel_forIyZNS_17ParallelRadixSortINS_4sse212PresplitItemEjE17tbbRadixIterationEjbPKS3_PS3_yEUlyE0_EEvT_RKT0_ENKUlRKNS_5rangeIyEEE_clESG_.9227336.fefa44b2c0907de9
(null):0: confused by earlier errors, bailing out
make: *** [/tmp/ccphOoha.mk:305: /tmp/cc6f9UAr.ltrans101.ltrans.o] Error 1
make: *** Waiting for unfinished jobs....
(null):0: confused by earlier errors, bailing out
make: *** [/tmp/ccphOoha.mk:383: /tmp/cc6f9UAr.ltrans127.ltrans.o] Error 1
./core/templates/sort_array.h: In function 'unguarded_insertion_sort.constprop.isra':
./core/templates/sort_array.h:288:59: warning: iteration 1537228672809129301 invokes undefined behavior [-Waggressive-loop-optimizations]
  288 |                         unguarded_linear_insert(i, p_array[i], p_array);
      |                                                           ^
./core/templates/sort_array.h:287:45: note: within this loop
  287 |                 for (int64_t i = p_first; i != p_last; i++) {
      |                                             ^
thirdparty/misc/polypartition.cpp: In function 'Triangulate_OPT.isra':
thirdparty/misc/polypartition.cpp:711:29: warning: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
  711 |   dpstates = new DPState *[n];
      |                             ^
/usr/lib/gcc/x86_64-w64-mingw32/12-posix/include/c++/new:128:26: note: in a call to allocation function 'operator new []' declared here
  128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
      |                          ^
lto-wrapper: fatal error: make returned 2 exit status
compilation terminated.
/usr/bin/x86_64-w64-mingw32-ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
scons: *** [bin/godot.windows.editor.x86_64.exe] Error 1
scons: building terminated because of errors.

@berarma
Contributor

berarma commented Feb 19, 2025

Amended first paragraph in my last comment which didn't make any sense. Had a mental lapse between long builds.

akien-mga added a commit to akien-mga/godot that referenced this issue Feb 20, 2025
…gram`

- Works around and closes godotengine#102867.
- Works around and closes godotengine#102982.

Co-authored-by: Hein-Pieter van Braam-Stewart <hp@tmm.cx>
@akien-mga
Member

Could someone who reproduces the issue test these builds from #103077 to confirm that it's fixing the bug?

#103077 (comment)

@0xcafeb33f
Contributor Author

Could someone who reproduces the issue test these builds from #103077 to confirm that it's fixing the bug?

#103077 (comment)

I tested the 64-bit version, and it fixes the crash for me.

However, the new build has some odd lag spikes in the video that weren't present in beta3. I'm unsure if this is related to the new build flags. I'll try to make a custom build of beta3 with these flags to confirm if they are the cause.

@berarma
Contributor

berarma commented Feb 20, 2025

Could someone who reproduces the issue test these builds from #103077 to confirm that it's fixing the bug?
#103077 (comment)

I tested the 64-bit version, and it fixes the crash for me.

However, the new build has some odd lag spikes in the video that weren't present in beta3. I'm unsure if this is related to the new build flags. I'll try to make a custom build of beta3 with these flags to confirm if they are the cause.

Does this happen with the MRP? Do they play correctly in ffplay?

Videos encoded with FFmpeg could be incorrectly encoded, although they should play the same in ffplay and Godot 4.4beta4.

FFmpeg has already been fixed, but you will need a very recent build.

@0xcafeb33f
Contributor Author

Does this happen with the MRP? Do they play correctly in ffplay?

Videos encoded with FFmpeg could be incorrectly encoded, although they should play the same in ffplay and Godot 4.4beta4.

FFmpeg has already been fixed, but you will need a very recent build.

Yes, specifically for the MRP above, on my Windows 11 machine (with RTX3070): beta3 produces smooth video output; rc.1 produces lag spikes. This behavior is repeatable.

The video in the MRP was encoded with ffmpeg version 7.0.1-full_build-www.gyan.dev. I'm not sure what you mean regarding ffmpeg being fixed -- was there a bug in their library for Ogg Theora encoding?

@berarma
Contributor

berarma commented Feb 20, 2025

The video in the MRP was encoded with ffmpeg version 7.0.1-full_build-www.gyan.dev. I'm not sure what you mean regarding ffmpeg being fixed -- was there a bug in their library for Ogg Theora encoding?

Yes, there were a couple of bugs in FFmpeg that have been recently fixed in their static daily builds.

Please, re-encode the videos from the source with the newest version to make sure it is not due to those bugs.

@0xcafeb33f
Contributor Author

0xcafeb33f commented Feb 20, 2025

Re-encoded with version 2025-02-20-git-bc1a3bfd2c of ffmpeg. New MRP: videoplayerbug.zip

Same issue. beta3 has smooth video playback while rc.1 has lag spikes. I set up beta3 with #103077 to see if the flags are the issue, but it's still compiling.

Update: Not reproducible in either beta3 or beta3 with #103077.

@berarma
Contributor

berarma commented Feb 20, 2025

I can't reproduce on Linux. There are some frame skips on both beta3 and beta4, but that's to be expected on my computer when playing a 1920x1080/144fps video on a decoder without GPU acceleration.

Can you paste the FPS log for both versions?

@0xcafeb33f
Contributor Author

Both versions show 144fps consistently for me. I tried setting max fps to 30 and it actually makes the issue worse (the entire video appears to slow down).

Here's a recording of the playback on RC.1 for me:

2025-02-20.15-45-39.mp4

@berarma
Contributor

berarma commented Feb 20, 2025

Can you create a new issue with the new MRP and this information? I'm going to investigate it. Thanks!

@0xcafeb33f
Contributor Author

The issue of videos playing with lag spikes has been submitted as #103106.
