Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Web] Don't cache emsdk #98965

Merged
merged 1 commit into from
Nov 10, 2024
Merged

Conversation

AThousandShips
Copy link
Member

@AThousandShips AThousandShips commented Nov 8, 2024

Due to how caching works between branches, and this cache being identical for the same version, having this cached doesn't really help. Unfortunately unifying these cache entries doesn't work because of this, so I think it's better to just not cache them at all, downloading emsdk doesn't seem to take much time either (especially since I think CI caches downloads in other ways already) so this won't really hurt performance wise

But saves us a sizable amount of cache space

I've made a 3.x specific version as well and will open a PR for that if this is approved

@AThousandShips AThousandShips added bug enhancement platform:web topic:buildsystem cherrypick:3.x Considered for cherry-picking into a future 3.x release cherrypick:3.5 Considered for cherry-picking into a future 3.5.x release cherrypick:4.1 cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release cherrypick:4.3 Considered for cherry-picking into a future 4.3.x release cherrypick:3.6 Considered for cherry-picking into a future 3.6.x release labels Nov 8, 2024
@AThousandShips AThousandShips added this to the 4.4 milestone Nov 8, 2024
@AThousandShips AThousandShips requested a review from a team as a code owner November 8, 2024 14:17
@AThousandShips
Copy link
Member Author

The alternative solution would be to simply not cache emsdk at all, I don't think that would cost all that much in the runtime of the checks, but I'd say we should either unify the cache entries or not cache at all

@AThousandShips

This comment was marked as resolved.

@AThousandShips AThousandShips marked this pull request as draft November 9, 2024 12:02
@AThousandShips
Copy link
Member Author

Doing some further reading on cache management it seems this cache cannot be read by other branches when created on a branch, so will change this to simply not caching emsdk as it isn't functional and do some further testing

@AThousandShips AThousandShips changed the title [Web] Use single cache per version for emsdk [Web] Don't cache emsdk Nov 9, 2024
@AThousandShips AThousandShips marked this pull request as ready for review November 9, 2024 12:45
@akien-mga
Copy link
Member

Just to clarify the intent of the cache, the original idea was to try to cache only based on the Emscripten version, so all builds that use the same Emscripten version would just reuse that cache. So we'd have at most one cache per Emscripten version (and maybe per Godot base branch?).

But I never really understood in depth how the cache key system works and I'm not sure it ever worked. What did cause issues however was that the Emscripten GH action didn't seem able to cope with two jobs trying to access the same cache at the same time. Or maybe it was upload - where I'd question why we're always reuploading a new version to the cache, ideally it should be treated as static and only require an upload if the data needs to change.

So if we can't make it work like that, removing the cache makes sense to me.


Note that another thing which should (at least I would expect it to) get cached is generated during the build, as you can see at the end of the Compile step:

cache:INFO: generating system asset: symbol_lists/532cf52649ea564c884d63e646c170c4e1bc83d0.json... (this will be cached in "/home/runner/work/_temp/cb745811-a123-4446-b3b0-d22635069551/emsdk-main/upstream/emscripten/cache/symbol_lists/532cf52649ea564c884d63e646c170c4e1bc83d0.json" for subsequent builds)
cache:INFO:  - ok

But this may not take too long to generate so it's probably an ok tradeoff to not do caching for emsdk on CI.

When doing LTO builds, there's a lot more stuff that gets generated at linking time and that can be cached, but that's not a use case we have on CI.

@AThousandShips
Copy link
Member Author

AThousandShips commented Nov 9, 2024

So to clarify some details:

  • The current cache key would need to be changed regardless as it would only match the exact same SHA
  • A cache can only be loaded by either:
    • The branch that created it
    • A merge in a PR that is based on that same branch (i.e. a PR starting from master could access it if created on master, but master could not fetch it from the PR, nor another PR fetch it from that PR)

So to fix this we'd have to create this cache only on individual branches, like master or 3.x and then only fetch in PRs

This could work, not sure exactly how to do CI foo to achieve that but it's possible, it would only update the cache when ever it fails to fetch one, but we'd need to do something to ensure the cache is generated before we run the two web builds or they'd clash.

We could theoretically do this as part of the static checks, but I think such a change, being bound to a specific branch, would be pretty fragile

For the last step of the compilation that doesn't seem to be cached between runs and will look into if we can do so separately, it isn't stored in the emsdk cache path provided by the action so it seems to be separate, but will investigate


To put the cache load of the emsdk cache into perspective we currently have ten entries for emsdk in our caches, totaling around 4.2 GB just from 4.x, and for 3.x just one but 710 MB, so around half of our cache allotment is taken up just by that currently

Due to how caches are accessed this cache is almost useless, it only
matters if it is from the same branch or a base branch, and is identical
between branches, so caching it just clutters the build cache
Copy link
Contributor

@Repiteo Repiteo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To put the cache load of the emsdk cache into perspective we currently have ten entries for emsdk in our caches, totaling around 4.2 GB just from 4.x, and for 3.x just one but 710 MB, so around half of our cache allotment is taken up just by that currently

SHEESH! And here I was splitting hairs over cache efficiency with godot-cpp/plaintext, when this was effectively capping us at 5GB all this time

@AThousandShips
Copy link
Member Author

I'll do some further experimentation with the action and see if we can do some specific caching to help with it, and especially looking into if we can cache the generated system symbols

The latter should be possible to cache along with the SCons cache

Having a single emsdk cache for each branch and version only fetched on PRs should be possible with the system but will see

@AThousandShips
Copy link
Member Author

AThousandShips commented Nov 10, 2024

Discovered an alternative option will investigate

Not trivial to set up so will follow up if I can resolve it

@AThousandShips AThousandShips marked this pull request as draft November 10, 2024 09:03
@AThousandShips AThousandShips marked this pull request as ready for review November 10, 2024 09:18
@akien-mga
Copy link
Member

I'll do some further experimentation with the action and see if we can do some specific caching to help with it, and especially looking into if we can cache the generated system symbols

To be clear, I'm not necessarily saying caching the system symbols is worth it. We should measure how long it takes to generate. If it's less than 15s there's really no need to add extra caching just for that.

@Repiteo Repiteo merged commit eb1d4b2 into godotengine:master Nov 10, 2024
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Nov 10, 2024

Thanks!

@AThousandShips AThousandShips deleted the emsdk_cache_fix branch November 10, 2024 19:39
@AThousandShips
Copy link
Member Author

Thank you!

@AThousandShips AThousandShips removed cherrypick:3.x Considered for cherry-picking into a future 3.x release cherrypick:3.5 Considered for cherry-picking into a future 3.5.x release cherrypick:3.6 Considered for cherry-picking into a future 3.6.x release labels Nov 10, 2024
@akien-mga akien-mga removed the cherrypick:4.2 Considered for cherry-picking into a future 4.2.x release label Mar 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug cherrypick:4.3 Considered for cherry-picking into a future 4.3.x release enhancement platform:web topic:buildsystem
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants