Switch build_test_all_bazel to new dockerfile and runners. #18533
Conversation
Force-pushed from bc3bd09 to 0ebc897.
Ooooooof, #16915 is hurting compile time a lot here. We shouldn't need to build all the LLVM targets.

Got a successful build... it only took 55 minutes on a 96 core machine 🤦‍♂️. Could try updating Bazel and/or using a ramdisk. This might be the old issue where Bazel performance is pessimized on systems with high core counts. My local build (on a similar high power machine) finished in closer to 20-30 minutes. We can also prune build deps, or try getting a new remote cache spun up. OR just drop Bazel support / move it down a support tier to nightly builds. Ehhhh, no good options, just different compromises, and they all take time.
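For reference, the ramdisk idea might look roughly like the sketch below. The tmpfs mount point, output base path, and target pattern are placeholders rather than what the workflow actually uses, and it assumes Bazel/Bazelisk is already on PATH inside the container (the workflow installs it separately).

```bash
# Hypothetical sketch: keep Bazel's output base on a tmpfs (RAM disk) so the
# heavy build I/O stays in memory instead of the container's overlay filesystem.
docker run --rm \
  --mount type=tmpfs,destination=/tmp/bazel-ramdisk \
  -v "$PWD:/work" -w /work \
  ghcr.io/iree-org/cpubuilder_ubuntu_jammy_x86_64 \
  bazel --output_base=/tmp/bazel-ramdisk build //...
```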
env:
  IREE_CUDA_DEPS_DIR: /usr/local/iree_cuda_deps
run: |
  ./build_tools/bazel/install_bazelisk.sh 1.21.0
I considered installing Bazel or Bazelisk in the cpubuilder dockerfile in iree-org/base-docker-images#9. Decided against it for now, to keep the dockerfile simpler.
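For context, the option decided against would have looked roughly like the sketch below inside the cpubuilder dockerfile. The version pin mirrors the 1.21.0 passed to install_bazelisk.sh above, and the download URL follows upstream Bazelisk release naming, but none of this is taken from iree-org/base-docker-images#9.

```dockerfile
# Illustrative only: bake a pinned Bazelisk into the image as `bazel`.
ARG BAZELISK_VERSION=1.21.0
RUN curl -fsSL \
      "https://github.com/bazelbuild/bazelisk/releases/download/v${BAZELISK_VERSION}/bazelisk-linux-amd64" \
      -o /usr/local/bin/bazel \
    && chmod +x /usr/local/bin/bazel
```

Keeping this out of the image means the workflow controls the Bazel version, at the cost of a small install step on every run.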
cp ./build_tools/docker/context/fetch_cuda_deps.sh /usr/local/bin
/usr/local/bin/fetch_cuda_deps.sh ${IREE_CUDA_DEPS_DIR}
These CUDA deps were part of the base-bleeding-edge dockerfile (build_tools/docker/dockerfiles/base-bleeding-edge.Dockerfile, lines 93 to 97 at 3a62d5c):

    ######## IREE CUDA DEPS ########
    ENV IREE_CUDA_DEPS_DIR="/usr/local/iree_cuda_deps"
    COPY build_tools/docker/context/fetch_cuda_deps.sh /usr/local/bin
    RUN /usr/local/bin/fetch_cuda_deps.sh "${IREE_CUDA_DEPS_DIR}"
    ##############
See the notes in build_tools/bazel/workspace.bzl too.

I don't care enough right now to refactor how the Bazel build handles CUDA, so I'm choosing the path of least resistance and putting more logic in the workflow.
on:
  pull_request:
    paths:
      - ".github/workflows/ci_linux_x64_bazel.yml"
  schedule:
    # Weekday mornings at 09:15 UTC = 01:15 PST (UTC - 8).
    - cron: "15 9 * * 1-5"
  workflow_dispatch:
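Since `workflow_dispatch` stays enabled, the job can still be kicked off on demand between scheduled runs, for example with the GitHub CLI (the workflow file name is taken from the `paths` filter above):

```bash
# Manually trigger the Bazel CI job against main.
gh workflow run ci_linux_x64_bazel.yml --ref main
```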
We discussed how frequently this job should run here on Discord.
We settled on a compromise for now:
- Run nightly instead of on every commit
- Look at setting up a remote build cache on the same nginx server that we use for CMake's ccache/sccache storage (using webdav); a rough sketch of the Bazel side is below
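A minimal sketch of what pointing Bazel at such a cache could look like, assuming an HTTP endpoint eventually gets exposed (the URL is a placeholder; Bazel's HTTP cache protocol uses plain GET/PUT, which a WebDAV-backed server can serve):

```bash
# Hypothetical: read from / write to an HTTP remote cache for this build.
bazel build --remote_cache=https://cache.example.com/bazel-cache //...
```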
LGTM
Progress on #15332 and #18238. Fixes #16915.
This switches the `build_test_all_bazel` CI job from the `gcr.io/iree-oss/base-bleeding-edge` dockerfile using GCP for remote cache storage to the `ghcr.io/iree-org/cpubuilder_ubuntu_jammy_x86_64` dockerfile with no remote cache.

With no cache, this job takes between 18 and 25 minutes. Early testing also showed times as long as 60 minutes when the Docker command and runner are not optimally configured for Bazel (e.g. not using a RAM disk).
The job is also moved from running on every commit to running on a nightly schedule while we evaluate how frequently it breaks and how long it takes to run. If we set up a new remote cache (https://bazel.build/remote/caching), we can move it back to running more regularly.