Update LLVM to llvm/llvm-project@b13592219c421820b #19554
@raikonenfnu let's revert the tile and fuse change and land it. @Groverkss you can add your changes and undo the revert |
02591dc to dae2304
Update LLVM to llvm/llvm-project@b13592219c421820b (llvm/llvm-project#85376)
Changes done to resolve mlirbc issue in iree-org#19498
This PR also carries the following reverts:
llvm/llvm-project#120115
The main issue with this PR is it breaks matvec codegen generating scf.if instead of scf.for(s). An issue will be pushed up for repro.
Signed-off-by: Stanley Winata <stanley.winata@amd.com>
ref: f5615ab29da491c0047146258dfa3a0c40c735e5
ref: 601db0e472600a94ddb69b37d05cd7d4a17f89b2
@ScottTodd In order to drop the reverts and not keep them for too long, I updated the mlirbc of open-llama in SHARK-TestSuite and kept it in a personal branch https://github.com/nod-ai/SHARK-TestSuite/tree/raikonen/integrate/llvm20241220 (which is 1 commit on top of the SHA currently pinned for SHARK-TestSuite).
Here is how I regenerated the MLIRBC (which should preserve most of the information from the old mlirbc):
    iree-opt open-llama-3b-v2-f16.mlirbc -o open-llama-3b-v2-f16.mlir
    rm open-llama-3b-v2-f16.mlirbc
    iree-opt open-llama-3b-v2-f16.mlir --emit-bytecode -o open-llama-3b-v2-f16.mlirbc
LMK what you think!
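For anyone repeating this on other files, the round trip above could be scripted roughly as follows. This is only a sketch: `roundtrip_commands` and `regenerate` are hypothetical helper names, and the only `iree-opt` arguments used are the ones shown in the comment (`-o` and `--emit-bytecode`). It assumes `iree-opt` is on `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List


def roundtrip_commands(bytecode_path: str, opt_tool: str = "iree-opt") -> List[List[str]]:
    """Build the two invocations that regenerate a .mlirbc file.

    Step 1 converts the existing bytecode to textual IR (.mlir).
    Step 2 re-emits bytecode from that text. Nothing is executed here.
    """
    bytecode = Path(bytecode_path)
    if bytecode.suffix != ".mlirbc":
        raise ValueError(f"expected a .mlirbc file, got {bytecode_path!r}")
    text = bytecode.with_suffix(".mlir")
    return [
        [opt_tool, str(bytecode), "-o", str(text)],
        [opt_tool, str(text), "--emit-bytecode", "-o", str(bytecode)],
    ]


def regenerate(bytecode_path: str) -> None:
    """Run the round trip; stops at the first failing command."""
    for cmd in roundtrip_commands(bytecode_path):
        subprocess.run(cmd, check=True)
```

The explicit `rm` from the comment is omitted here because the second command writes to the same output path; keeping the intermediate `.mlir` file around also makes it easy to inspect or upload the textual IR.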
I deleted that test: nod-ai/SHARK-TestSuite#418
I don't have a replacement ready for the two tests here yet though: https://github.com/nod-ai/SHARK-TestSuite/tree/main/iree_tests/pytorch/models. We can delete this either before or after having a replacement:
iree/.github/workflows/pkgci_regression_test.yml
Lines 21 to 117 in fa325c5
  test_models:
    name: "test_models :: ${{ matrix.name }}"
    runs-on: ${{ matrix.runs-on }}
    strategy:
      fail-fast: false
      # Note: these jobs should use persistent runners with local caches.
      # Downloading test files (50GB+) without a cache can take 20+ minutes.
      matrix:
        include:
          # CPU
          - name: cpu_llvm_task
            models-config-file: models_cpu_llvm_task.json
            runs-on:
              - self-hosted # must come first
              - persistent-cache
              - Linux
              - X64
          # AMD GPU
          - name: amdgpu_rocm_mi250_gfx90a
            models-config-file: models_gpu_rocm_gfx90a.json
            runs-on: nodai-amdgpu-mi250-x86-64
          - name: amdgpu_rocm_mi300_gfx942
            models-config-file: models_gpu_rocm_gfx942.json
            runs-on: nodai-amdgpu-mi300-x86-64
          - name: amdgpu_vulkan
            models-config-file: models_gpu_vulkan.json
            runs-on: nodai-amdgpu-w7900-x86-64
          # NVIDIA GPU
          # None at the moment. Could maybe use the persistent a100 runners:
          # - self-hosted # must come first
          # - runner-group=${{ needs.setup.outputs.runner-group }}
          # - environment=${{ needs.setup.outputs.runner-env }}
          # - a100
          # - os-family=Linux
          # (note: would need to plumb the presubmit/postsubmit runner-group through to here too)
    env:
      PACKAGE_DOWNLOAD_DIR: ${{ github.workspace }}/.packages
      IREE_TEST_PATH_EXTENSION: ${{ github.workspace }}/build_tools/pkgci/external_test_suite
      MODELS_CONFIG_FILE_PATH: build_tools/pkgci/external_test_suite/${{ matrix.models-config-file }}
      VENV_DIR: ${{ github.workspace }}/venv
    steps:
      - name: Checking out IREE repository
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          submodules: false
      - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
        with:
          # Must match the subset of versions built in pkgci_build_packages.
          python-version: "3.11"
      - uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: linux_x86_64_release_packages
          path: ${{ env.PACKAGE_DOWNLOAD_DIR }}
      - name: Setup venv
        run: |
          ./build_tools/pkgci/setup_venv.py ${VENV_DIR} \
            --artifact-path=${PACKAGE_DOWNLOAD_DIR} \
            --fetch-gh-workflow=${{ inputs.artifact_run_id }}
      # Out of tree tests
      - name: Check out external TestSuite repository
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          repository: nod-ai/SHARK-TestSuite
          ref: 601db0e472600a94ddb69b37d05cd7d4a17f89b2
          path: SHARK-TestSuite
          submodules: false
          lfs: true
      - name: Install external TestSuite Python requirements
        run: |
          source ${VENV_DIR}/bin/activate
          python3 -m pip install -r SHARK-TestSuite/iree_tests/requirements.txt
          pip install --no-compile --pre --upgrade -e SHARK-TestSuite/common_tools
      - name: Download remote files for real weight model tests
        run: |
          source ${VENV_DIR}/bin/activate
          python SHARK-TestSuite/iree_tests/download_remote_files.py --root-dir iree_tests/pytorch/models
          python SHARK-TestSuite/iree_tests/download_remote_files.py --root-dir iree_tests/sharktank
      - name: Run external tests - models with real weights
        if: "matrix.models-config-file != '' && !cancelled()"
        run: |
          source ${VENV_DIR}/bin/activate
          pytest \
            SHARK-TestSuite/iree_tests/pytorch/models \
            SHARK-TestSuite/iree_tests/sharktank \
            -rA \
            -k real_weights \
            --no-skip-tests-missing-files \
            --capture=no \
            --log-cli-level=info \
            --timeout=600 \
            --durations=0 \
            --config-files=${MODELS_CONFIG_FILE_PATH}
 sd3_clip_mlir = fetch_source_fixture(
-    "https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/model.mlirbc",
+    "https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/model.mlir",
     group="sd3_clip",
@saienduri I regenerated the "outdated" mlirbc files and uploaded them as IR into the same name and directory but with text and as ".mlir" suffix. LMK what you think! :)
If continuing to use Azure for these files (remember, this is "experimental" code so the bar is currently low for organization and code quality), I'd rather we not use individual user names and do use dates or versions, so a path like
sharkpublic/sharktank_tests/models/sd3/prompt-encoder/model_2024_12_23.mlirbc
sharkpublic/sharktank_tests/models/sd3/prompt-encoder/model_v3.1.0rc20241223.mlirbc
sharkpublic/sharktank_tests/models/sd3-prompt-encoder-v3.1.0rc20241223.mlirbc
sharkpublic/sharktank_tests/models/sd3/prompt-encoder/sd3-prompt-encoder-v3.1.0rc20241223.mlirbc
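As an illustration of the first naming option above, a date-stamped path could be generated rather than typed by hand. This is a minimal sketch; `dated_model_path` and its parameters are hypothetical, and the default root is simply taken from the example paths in this comment.

```python
from datetime import date


def dated_model_path(
    model: str,
    component: str,
    when: date,
    ext: str = "mlirbc",
    root: str = "sharkpublic/sharktank_tests/models",  # assumed from the examples above
) -> str:
    """Build a storage path that carries a date instead of a user name."""
    stamp = when.strftime("%Y_%m_%d")
    return f"{root}/{model}/{component}/model_{stamp}.{ext}"


# Reproduces the first example path from the comment.
print(dated_model_path("sd3", "prompt-encoder", date(2024, 12, 23)))
```

A version string (such as the `v3.1.0rc20241223` form in the other examples) could be substituted for the date stamp in the same way.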
Update LLVM to llvm/llvm-project@b13592219c421820b (llvm/llvm-project#85376)
Resolves the mlirbc issue in #19498 by regenerating the input IRs in Azure and SHARK-TestSuite so that they work with the latest mlir-opt.
This PR also carries the following reverts:
llvm/llvm-project#120115
llvm/llvm-project#119461
The main issue with PR 120115 is that it breaks matvec codegen, generating scf.if instead of scf.for(s). An issue with a repro will be filed.
The main issue with PR 119461 is that it breaks the e2e riscv test by making it get stuck in an infinite loop.