SCons: Improve cache purging logic #98154
Opted to expand the scope of this PR to SCons caching in general, so the following changes were made:
For adding command line arguments, see also: |
I like your implementation of enabling the cache based on supplying a path, so I integrated that instead. Still left the |
Life is too short to try to understand SCons caching code :P
If it works, I'm happy with it.
Will give this a look tomorrow.
Will go over the cache detail code tomorrow, but it works correctly for me and looks good at a glance, save for these changes.
For some reason the caching is broken on Windows in the runners, it uses |
See above
I'd be surprised if it was the |
You're right, it's not that; you need quotes around the argument for Windows, just like |
Looks resolved, will review in depth tomorrow
My bad, those were the non-valid cached files mentioned in the original, so that should be safe. Will just run some tests to ensure it doesn't break compilation stability.
Might want to add a separate message for the purge of invalidly cached files, to make it clear that the cache isn't broken when it purges while under the limit; it is just purging files that shouldn't be cached in the first place.
Maybe something like:
diff --git a/methods.py b/methods.py
index 33fcca5021..73b49775df 100644
--- a/methods.py
+++ b/methods.py
@@ -858,6 +858,7 @@ def clean_cache(cache_path: str, cache_limit: int, verbose: bool):
     # Remove all text files, store binary files in list of (filename, size, atime).
     purge = []
+    texts = []
     stats = []
     for file in files:
         # Failing a utf-8 decode is the easiest way to determine if a file is binary.
@@ -869,7 +870,18 @@ def clean_cache(cache_path: str, cache_limit: int, verbose: bool):
         except OSError:
             print_error(f'Failed to access cache file "{file}"; skipping.')
         else:
-            purge.append(file)
+            texts.append(file)
+
+    if texts:
+        count = len(texts)
+        for file in texts:
+            try:
+                os.remove(file)
+            except OSError:
+                print_error(f'Failed to remove cache file "{file}"; skipping.')
+                count -= 1
+        if verbose:
+            print("Purging %d text %s from cache..." % (count, "files" if count > 1 else "file"))
     if cache_limit:
         # Sort by most recent access (most sensible to keep) first. Search for the first entry where
Edit: tested that and it works; I think it helps clarify what's going on.
Will go through some more of the code, but it seems to work great. Will make a final review today if I can.
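For readers following along, the text-file pass suggested in the diff above can be sketched in isolation roughly like this. It is only an illustration, not the code in `methods.py`: the helper names `is_text_file`, `purge_text_files`, and `purge_message` are hypothetical, and the sketch also handles a count of zero in the message, which the suggested diff does not.

```python
import os
from typing import Iterable, List, Tuple


def is_text_file(path: str, probe_chars: int = 1024) -> bool:
    """Return True if the start of the file decodes as UTF-8.

    Failing a UTF-8 decode is treated as "binary", mirroring the comment
    in the diff. A binary file made only of ASCII bytes would still be
    classified as text, so this is a heuristic.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            handle.read(probe_chars)
    except UnicodeDecodeError:
        return False
    return True


def purge_text_files(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Delete text files from `paths`; return (removed, binaries)."""
    removed: List[str] = []
    binaries: List[str] = []
    for path in paths:
        try:
            text = is_text_file(path)
        except OSError:
            continue  # Unreadable file: skip it rather than abort the pass.
        if not text:
            binaries.append(path)
            continue
        try:
            os.remove(path)
        except OSError:
            continue  # Could not delete: leave it out of the removed list.
        removed.append(path)
    return removed, binaries


def purge_message(count: int) -> str:
    """Build the verbose message with correct pluralization (hypothetical)."""
    noun = "file" if count == 1 else "files"
    return f"Purging {count} text {noun} from cache..."
```

Keeping the text purge separate from the size-limit purge is what lets the message state plainly that these files should never have been cached, regardless of how full the cache is.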
Apart from my suggestion above this works well and looks correct!
Concerns about the deleted text files causing a significant increase in rebuild time
(My bad, pushed to the wrong branch, and GitHub is slow in updating the changes for some reason)
Working through re-compilation tests, but initial results are (master -> this PR):
iOS: 51s -> 7m 06s
Linux t/mono: 39s -> 11m 10s
Will test by removing the text purging and see if that is specifically what breaks this or not.
Edit: Did some testing and it doesn't seem to be due to the purging of text files. Will have to dig deeper, but something seems to be broken with caching on plain re-runs.
I think I've got it, it seems to be due to the moving of the cache initialization to the end of
Edit: Tested and moving the initialization of the cache up to where it used to be (where
Changes required:
diff --git a/SConstruct b/SConstruct
index f7fe98cb5c..8da4246fbd 100644
--- a/SConstruct
+++ b/SConstruct
@@ -1043,6 +1043,8 @@ GLSL_BUILDERS = {
 }
 env.Append(BUILDERS=GLSL_BUILDERS)
+methods.prepare_cache(env)
+
 if env["compiledb"]:
     env.Tool("compilation_db")
     env.Alias("compiledb", env.CompilationDatabase())
@@ -1095,7 +1097,6 @@ if "check_c_headers" in env:
 methods.show_progress(env)
-methods.prepare_cache(env)
 # TODO: replace this with `env.Dump(format="json")`
 # once we start requiring SCons 4.0 as min version.
 methods.dump(env)
Unsure if there are any issues in running some of that code earlier, but in my testing this works and solves the problem (will confirm with the text purging restored as well).
See above again
There, lots of back and forth, but this would be my final suggestion for changes. It retains the purging of text files, prints them more clearly, and fixes rebuild times:
diff --git a/SConstruct b/SConstruct
index f7fe98cb5c..8da4246fbd 100644
--- a/SConstruct
+++ b/SConstruct
@@ -1043,6 +1043,8 @@ GLSL_BUILDERS = {
 }
 env.Append(BUILDERS=GLSL_BUILDERS)
+methods.prepare_cache(env)
+
 if env["compiledb"]:
     env.Tool("compilation_db")
     env.Alias("compiledb", env.CompilationDatabase())
@@ -1095,7 +1097,6 @@ if "check_c_headers" in env:
 methods.show_progress(env)
-methods.prepare_cache(env)
 # TODO: replace this with `env.Dump(format="json")`
 # once we start requiring SCons 4.0 as min version.
 methods.dump(env)
diff --git a/methods.py b/methods.py
index 33fcca5021..73b49775df 100644
--- a/methods.py
+++ b/methods.py
@@ -858,6 +858,7 @@ def clean_cache(cache_path: str, cache_limit: int, verbose: bool):
     # Remove all text files, store binary files in list of (filename, size, atime).
     purge = []
+    texts = []
     stats = []
     for file in files:
         # Failing a utf-8 decode is the easiest way to determine if a file is binary.
@@ -869,7 +870,18 @@ def clean_cache(cache_path: str, cache_limit: int, verbose: bool):
         except OSError:
             print_error(f'Failed to access cache file "{file}"; skipping.')
         else:
-            purge.append(file)
+            texts.append(file)
+
+    if texts:
+        count = len(texts)
+        for file in texts:
+            try:
+                os.remove(file)
+            except OSError:
+                print_error(f'Failed to remove cache file "{file}"; skipping.')
+                count -= 1
+        if verbose:
+            print("Purging %d text %s from cache..." % (count, "files" if count > 1 else "file"))
     if cache_limit:
         # Sort by most recent access (most sensible to keep) first. Search for the first entry where
Needs some testing to confirm the placement of the cache initialization though, might need to be split into one part setting |
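As a rough illustration of the size-limit pass that the last context line of the diff starts to describe, here is a minimal sketch. It assumes the `(filename, size, atime)` tuples mentioned in the diff comment, assumes the limit is given in bytes (the unit used by the PR is not stated in this thread), and treats a falsy limit as "no limit" to match `if cache_limit:`. The function name `select_over_limit` is hypothetical.

```python
from typing import List, Optional, Tuple

# (filename, size in bytes, last access time), as in the diff comment.
CacheEntry = Tuple[str, int, float]


def select_over_limit(stats: List[CacheEntry], limit_bytes: Optional[int]) -> List[str]:
    """Return the filenames to purge so the kept files fit in `limit_bytes`.

    Entries are sorted by most recent access first (most sensible to keep).
    The first entry at which the running total exceeds the limit, and every
    entry after it, is selected for purging.
    """
    if not limit_bytes:
        return []  # No limit configured: nothing to purge on size grounds.
    ordered = sorted(stats, key=lambda entry: entry[2], reverse=True)
    total = 0
    for index, (_, size, _) in enumerate(ordered):
        total += size
        if total > limit_bytes:
            return [name for name, _, _ in ordered[index:]]
    return []
```

For example, with entries of 50, 30, and 40 bytes accessed in that order of recency and a limit of 90 bytes, only the least recently accessed 40-byte entry is selected.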
Hmm, if putting the cache handling later caused longer build times, would placing it even earlier improve it? Might give that a spin.
• Implement caching via SCons arguments, rather than environment variables
Everything looks good now and my version of the same solution works well
So now enabling cache is just about setting these two options?
If you just wanna enable it, all you need is the path option. If you don't care about size you can leave the limit option alone.
Been banging my head against a wall trying to figure out the nuance of SCons' caching systems, and came out with two key takeaways:
• The methods.py implementation is very messy and riddled with legacy code
• Files that should be marked NoCache are being cached, particularly text files
Thankfully, these go hand-in-hand, as a refactor of the former allowed me to add a dedicated pass for the latter. After compression, it ends up being a ~5MB reduction in cache size per build. Beyond that, there was general simplification of the code to make it more readable, as well as the removal of a redundant key check in the cache-restore action. I wholly believe that this can be further improved, particularly if binary files are being erroneously cached as well, but anything beyond this will require a deeper dive.