Bazel fails for cache deadline exceeded and cache unavailable errors #24120
Labels: P2, team-Remote-Exec, type: bug
Description of the bug:
Bazel seems to be unable to recover if inputs to a given action cannot be downloaded from cache, while it recovers in other cases.
Bazel fails with either:
Failed to fetch blobs because of a remote cache error.: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED
Failed to fetch blobs because of a remote cache error.: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
Tested with a few versions of Bazel (7.1.1, 7.4.0, last_green).
On the last_green version, Bazel actually retries actions, possibly due to the --experimental_remote_cache_eviction_retries flag, but it always fails if the above errors reappear.
The examples included here are ones where Bazel always fails.
Note that neither --remote_local_fallback nor --experimental_remote_cache_eviction_retries helps in this example.
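For readers unfamiliar with these two flags, they could be enabled in a `.bazelrc` roughly as sketched here. This is only an illustration, not the configuration used in this report: the cache endpoint and the retry count are assumed values.

```
# Illustrative .bazelrc fragment; NOT the reporter's actual configuration.
# Assumption: a bazel-remote instance listening for gRPC on localhost:9092.
build --remote_cache=grpc://localhost:9092

# Fall back to local execution when remote execution or caching fails.
build --remote_local_fallback

# Retry the build when blobs are reported as evicted from the remote cache.
# The value 5 is an arbitrary example.
build --experimental_remote_cache_eviction_retries=5
```

With a setup along these lines, the report's observation is that the build still fails once the DEADLINE_EXCEEDED or UNAVAILABLE error recurs while fetching action inputs.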
Which category does this issue belong to?
Remote Execution
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
MODULE.bazel:
BUILD:
Patched version of the remote cache https://github.com/buchgr/bazel-remote that just sleeps on the blob with "first_content":
Running cache:
Running bazel:
On the second execution of bazel:
For a cache that fails during blob retrieval, using this patch of bazel-remote:
Running bazel:
Which operating system are you running Bazel on?
Ubuntu 20.04
What is the output of bazel info release?
release 7.4.0
Have you found anything relevant by searching the web?
No local fallback after cache timeout - This is possibly a related issue; this report may be a duplicate of it (although the error messages are slightly different)
Rethink spawn strategies - A bag of ideas on how caching could change going forward