Implement seek operation for Theora video files #102360

berarma · 2025-02-03T12:45:44Z

I've implemented the set_stream_position operation for Theora video files and as a consequence also get_length. It also tries to comply with the documentation by not changing the current frame when stop is called and playing seamless loops.

I've made a simple videoplayer project to help test video playing on Godot.

Production edit: This closes Implement get_stream_length() for built-in supported video formats godot-proposals#10148.
See also VideoPlayer set_stream_position() doesn't work (seeking is not implemented) #14430
and [3.x] Support WebM seeking operation #57744.

UPDATE: I've done more testing and added a few more improvements/fixes:

Fix crackling sound when playing videos with 6 channels.
Add 8-channel audio support. (See 5.1 AudioStreams not playing all channels #37536 (comment))
Add stereo down-mixing when playing files with 4/6/8 channels. (See 5.1 AudioStreams not playing all channels #37536 (comment))
Use a permanent audio buffer to avoid copying the same buffer again and again when the output buffer is full.
Dynamic audio buffer size adapted to the number of channels, up to 8K samples,.

berarma · 2025-02-04T13:15:21Z

I've removed a short silence after a seek that was intentional to allow for the audio to sync without having to start decoding audio sooner. It was usually unnoticeable, but I found some files where the silence could be up to 1s long. Now, it will play audio from the very start without any silence.

While working on this, I thought I would explain a bit about how I've implemented the seek function in case anyone is interested. My goal was performance and precision, and at the same time make the code as simple as possible.

Long technical explanation

The only official documentation I could find, apart from the API ref, was this wiki page, but it's pretty vague about the details. I've found some more information and code that helped, but it was either too vague or too confusing for me to fully use, so I started my own implementation. Any code I'd used would have to be heavily adapted anyway.

OGG doesn't use indexes. The basic storage unit is a page, and it uses granules to put time marks on pages. Granulepos is the time unit in OGG containers. For video streams, this granulepos can be easily decoded into a frame number, or a key-frame number and an inter-frame offset. The problem is that not all pages have a granulepos, and when they have one, it's the granulepos of the last complete packet in the page. So finding the granule we need isn't as easy as it may seem, and it involves scanning through several pages until we guess the page it is in.

Something similar happens with audio streams, although granulepos is calculated differently. In this case, it's easier because there are no key-frames or alike.

But in both cases, when scanning, we have to make sure we catch the page where the packet we want starts.

There's an added complication for video. Calculating the key-frame and the inter-frame offset from a granulepos is easy, but the inverse operation isn't. The GOP inside a stream is variable, although most streams don't change it. We only know the max GOP for a stream, so it's impossible to calculate the granulepos from a key-frame number.

~~And one more glitch to account for. I've found at least one file that used duplicate granulepos instead of leaving it blank, so I had to filter this out.~~

The algorithm starts by guessing a position in the file where the key frame we're looking for might be, using the frame duration, the size in bytes of the file, and the stream length in seconds. This gives us a pretty good approximation. Then it starts scanning for the first page, and if we're past the time mark, it backtracks in the file.

The first page with a granulepos just before our calculated granulepos key frame guess is the point where we start decoding. The initial decoding process is done as fast as possible up to the time we're seeking, then resume the normal decoding process.

This algorithm gives very short seek times on my computer (Ryzen 3600) with debug builds. Immediate in some videos and one or two tenths of a second in others. I have to try it yet on optimized builds. It will depend on the GOP used, the higher, the more frames it will have to decode before displaying the first frame.

It could be a bit more optimized by making one more pass to look for the last key frame before decoding. This could improve seek times for streams with lower GOP values. We could also reduce backtracking in the file by fine-tuning the method or by using another method, although I don't think there's much to gain. As it is now, normally, one or too seeks in the file are enough, maxing at 4 or 5 seeks. In both cases, i think the added complexity might not pay back. Edit: After a lot of testing and some improvements I don't think there's much more to gain.

berarma · 2025-02-10T15:07:35Z

I've removed a workaround for a bad video encoding case that I only found in a very old video pre-dating Theora 1.0. It's not worth it.

Calinou

Tested locally using https://commons.wikimedia.org/wiki/File:Big_Buck_Bunny_first_23_seconds_1080p.ogv and https://commons.wikimedia.org/wiki/File:Big_Buck_Bunny_medium.ogv, it works as expected. I made sure sound remains in sync after seeking back and forth.

I don't have an OGV file with more than 2 audio channels to test the downmixing functionality though.

modules/theora/video_stream_theora.cpp

berarma · 2025-02-13T07:30:25Z

Tested locally using https://commons.wikimedia.org/wiki/File:Big_Buck_Bunny_first_23_seconds_1080p.ogv and https://commons.wikimedia.org/wiki/File:Big_Buck_Bunny_medium.ogv, it works as expected. I made sure sound remains in sync after seeking back and forth.

I don't have an OGV file with more than 2 audio channels to test the downmixing functionality though.

This comment has a video with 7.1 audio. I've also tested with this video that has 5.1 audio but it needs to be extracted and converted.

EDIT: I've uploaded the 5.1 video to make it easier.

baldierot · 2025-02-26T09:50:53Z

will this make it into stable?

berarma · 2025-02-26T09:54:02Z

will this make it into stable?

Not in 4.4. Let's hope it's possible for 4.5.

Delsin-Yu · 2025-02-26T09:54:16Z

No, we are at the feature freeze for 4.4, so it will at least be postponed to 4.5, assuming it's approved.

As a side effect, fixes crackling sound when the audio buffer isn't big enough to hold a full Vorbis packet. Not related to the buffer being permanent but a bug in the previous code. It would happen when the video had 6 channel audio tracks.

Calinou · 2025-03-04T21:42:14Z

The CI failure says:

Error: modules\theora\video_stream_theora.cpp(689): error C2220: the following warning is treated as an error
Warning: modules\theora\video_stream_theora.cpp(689): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)

https://stackoverflow.com/questions/41727042/vs-compilation-warning-result-of-32-bit-shift-implicitly-converted-to-64-bits

berarma · 2025-03-04T21:51:30Z

The CI failure says:
Error: modules\theora\video_stream_theora.cpp(689): error C2220: the following warning is treated as an error
Warning: modules\theora\video_stream_theora.cpp(689): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
https://stackoverflow.com/questions/41727042/vs-compilation-warning-result-of-32-bit-shift-implicitly-converted-to-64-bits

Yes, sorry, I've already had one of those. Forgot this time. I'll fix it ASAP. Thanks!

berarma · 2025-03-04T23:03:54Z

@Calinou Done. I was waiting for the checks to run again because they had failed to run before. That's why I haven't fixed it until today. This warning isn't caught by my compiler.

AndreaMonzini · 2025-03-18T14:23:42Z

Hello,
probably you already know but just to inform that since few days there is a new release of Theora 1.2.0beta1:

https://gitlab.xiph.org/xiph/theora/-/commit/69cf61ef163666f53a48374d3e8df7225913194d

I do not know if it can be used soon in production but in the release communication i read also:

A 1.2.0 release is planned in two weeks

and in the release note in particular i read:

Converted SCons setup to work with Python 3
Fixed all known compiler warnings and errors from gcc and clang.

News source :
https://www.phoronix.com/news/Theora-libtheora-1.2-Beta

Maybe it can be useful :)

berarma · 2025-03-18T14:35:54Z

Hello,
probably you already know but just to inform that since few days there is a new release of Theora 1.2.0beta1:

https://gitlab.xiph.org/xiph/theora/-/commit/69cf61ef163666f53a48374d3e8df7225913194d

I do not know if it can be used soon in production but in the release communication i read also:

A 1.2.0 release is planned in two weeks

and in the release note in particular i read:

Converted SCons setup to work with Python 3

Fixed all known compiler warnings and errors from gcc and clang.

Maybe it can be useful :)

Thanks. It's great that they're still updating it. We are using a git version that already has those fixes. This new version has been in the works for years. It would probably be a good time to update after the new stable release.

radlotus · 2025-03-18T15:23:15Z

So sad this didn't make it into 4.4. 4.5 is a while away I bet. I figure the alternative solution is exporting to web and using the browser's video player on top of Godot, but that's just a pain to implement and use for a beginner like me, especially when I want to deploy to mobile.

berarma · 2025-03-18T16:25:12Z

And it's not sure it will make it into 4.5. Maybe more users testing and posting feedback would help, I don't know.

berarma requested review from a team as code owners February 3, 2025 12:45

AThousandShips added enhancement topic:core labels Feb 3, 2025

AThousandShips added this to the 4.x milestone Feb 3, 2025

berarma force-pushed the theora_seek branch from a62f5d5 to 6ced619 Compare February 3, 2025 12:51

Calinou mentioned this pull request Feb 3, 2025

seek theora in progress #32539

Closed

6 tasks

Zireael07 mentioned this pull request Feb 3, 2025

seek theora in progress #32538

Closed

berarma force-pushed the theora_seek branch 3 times, most recently from b337297 to 426b9fd Compare February 4, 2025 12:47

berarma mentioned this pull request Feb 4, 2025

Implement post-processing levels for VideoStreamPlayer and Theora #102410

Draft

berarma force-pushed the theora_seek branch from 426b9fd to 365eac5 Compare February 4, 2025 14:54

berarma requested a review from a team as a code owner February 4, 2025 14:54

berarma mentioned this pull request Feb 4, 2025

Update playing videos godotengine/godot-docs#10614

Merged

berarma force-pushed the theora_seek branch 4 times, most recently from aa77130 to 025611d Compare February 12, 2025 12:22