Skip flaky test_top_k onnx op test on rdna3. #20327

Merged on Mar 20, 2025 (2 commits)


ScottTodd
Member

Disabled due to flaky behavior tracked at #18649.

Recent logs: https://github.com/iree-org/iree/actions/runs/13957484655/job/39072362817#step:8:51

=================================== FAILURES ===================================
___ IREE compile and run: test_top_k::model.mlir::model.mlir::gpu_rocm_rdna3 ___
[gw0] linux -- Python 3.11.9 /home/esaimana/actions-runner-2/_work/iree/iree/venv/bin/python
Error invoking iree-run-module
Error code: 1
Stderr diagnostics:

Stdout diagnostics:
EXEC @test_top_k
[FAILED] result[1]: element at index 5 (0) does not match the expected (1); expected that the view is equal to contents of a view of 3x3xi64
  expected:
3x3xi64=[3 2 1][3 2 1][3 2 1]
  actual:
3x3xi64=[3 2 1][3 2 0][3 2 1]

Test case source:
  https://github.com/iree-org/iree-test-suites/blob/main/onnx_ops/onnx/node/generated/test_top_k

Input program:
```
module {
  func.func @test_top_k(%arg0: !torch.vtensor<[3,4],f32>, %arg1: !torch.vtensor<[1],si64>) -> (!torch.vtensor<[3,3],f32>, !torch.vtensor<[3,3],si64>) attributes {torch.onnx_meta.ir_version = 6 : si64, torch.onnx_meta.opset_version = 17 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
    %none = torch.constant.none
    %0:2 = torch.operator "onnx.TopK"(%arg0, %arg1) {torch.onnx.axis = 1 : si64} : (!torch.vtensor<[3,4],f32>, !torch.vtensor<[1],si64>) -> (!torch.vtensor<[3,3],f32>, !torch.vtensor<[3,3],si64>) 
    return %0#0, %0#1 : !torch.vtensor<[3,3],f32>, !torch.vtensor<[3,3],si64>
  }
}

```

Compiled with:
  cd /home/esaimana/actions-runner-2/_work/iree/iree/iree-test-suites/onnx_ops/onnx/node/generated/test_top_k && iree-compile model.mlir --iree-hal-target-backends=rocm --iree-hip-target=gfx1100 --iree-input-demote-f64-to-f32=false -o model_gpu_rocm_rdna3.vmfb

Run with:
  cd /home/esaimana/actions-runner-2/_work/iree/iree/iree-test-suites/onnx_ops/onnx/node/generated/test_top_k && iree-run-module --module=model_gpu_rocm_rdna3.vmfb --device=hip --flagfile=run_module_io_flags.txt

ci-exactly: build_packages, test_onnx
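
For readers decoding the failure message above, here is a minimal sketch (not part of the test suite) that maps the flat "index 5" from the log back to a row and column of the 3x3 result. The `expected` and `actual` values are copied from the log. The `mismatches` helper is a hypothetical name introduced only for illustration, and row-major flattening is assumed.

```python
# Values copied from the "expected" and "actual" lines of the log.
expected = [[3, 2, 1], [3, 2, 1], [3, 2, 1]]
actual = [[3, 2, 1], [3, 2, 0], [3, 2, 1]]


def mismatches(expected, actual):
    """Return (flat_index, row, col, expected_value, actual_value) per differing element.

    Hypothetical helper; assumes both inputs are row-major lists of equal shape.
    """
    cols = len(expected[0])
    found = []
    for row, (exp_row, act_row) in enumerate(zip(expected, actual)):
        for col, (exp_val, act_val) in enumerate(zip(exp_row, act_row)):
            if exp_val != act_val:
                found.append((row * cols + col, row, col, exp_val, act_val))
    return found


diffs = mismatches(expected, actual)
print(diffs)  # [(5, 1, 2, 1, 0)]
```

Under that assumption, only the last index of the second row differs (0 instead of 1); the other eight indices match.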

Collaborator

@benvanik benvanik left a comment

feel better about disabling now that the issue is being looked at - maybe tag the issue in the json? (hopefully it's jsonc?)

Signed-off-by: Scott Todd <scott.todd0@gmail.com>
@ScottTodd
Member Author

It is jsonc, but the files are also autogenerated, and the autogeneration does not preserve comments (or other local edits like ordering changes). I put enough context in here for the blame layer to be useful.
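
As a rough illustration of why comments do not survive regeneration, the sketch below parses a commented config and writes it back out. This is not the project's actual generator. The key name `skip_run_tests`, the file contents, and the `strip_line_comments` helper are assumptions made up for the example.

```python
import json

# Hypothetical hand-edited jsonc content; the key name is an assumption.
hand_edited = """{
  // Flaky on rdna3, tracked in issue 18649.
  "skip_run_tests": ["test_top_k"]
}"""


def strip_line_comments(text):
    """Drop whole-line // comments so the standard json module can parse the rest.

    Deliberately naive: it does not handle trailing or block comments.
    """
    kept = [line for line in text.splitlines() if not line.strip().startswith("//")]
    return "\n".join(kept)


# Parsing keeps only the data; the comment is gone before json.loads sees it.
data = json.loads(strip_line_comments(hand_edited))

# A generator that serializes the parsed data has no comment left to write.
regenerated = json.dumps(data, indent=2)
print(regenerated)
```

The test name survives the round trip but the explanatory comment does not, which is why the context lives in the commit and PR description instead.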

@benvanik
Collaborator

ahh gotcha!

Signed-off-by: Scott Todd <scott.todd0@gmail.com>
@ScottTodd ScottTodd marked this pull request as ready for review March 20, 2025 17:29
@ScottTodd ScottTodd added the codegen/rocm (ROCm code generation compiler backend (HIP/HSA)) and hal/amdgpu (Runtime AMDGPU HAL backend) labels Mar 20, 2025
@ScottTodd ScottTodd merged commit d84ab88 into iree-org:main Mar 20, 2025
29 checks passed
@ScottTodd ScottTodd deleted the flaky-topk branch March 20, 2025 17:30