Update branch #161

Merged
merged 61 commits into e2eshark_ci on Apr 12, 2024
Conversation

saienduri
Contributor

No description provided.

zjgarvey and others added 30 commits March 12, 2024 16:45
Add initial structure for gemma through e2eshark
Models for which the ONNX-to-torch-to-IREE CPU run passes
Passing tests
After upstream
[torch-mlir#3009](llvm/torch-mlir#3009) lands:

DynamicQuantizeLinear will fail due to uint8 being unsupported in
unpackbytearray (will post an issue).

DynamicQuantizeLinearCast will also fail unless
[torch-mlir#3018](llvm/torch-mlir#3018) also
lands.
Adds support for unpacking uint8 types.
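For context, a hedged Python sketch of the idea (the helper name mirrors the one mentioned above, but the body is illustrative, not the framework's actual implementation):

```
import numpy as np

# Illustrative helper: reinterpret a raw output byte buffer using the
# requested element type; uint8 is the case that was previously unsupported.
def unpack_bytearray(buf: bytes, num_elements: int, dtype: str) -> np.ndarray:
    np_dtype = {"uint8": np.uint8, "int8": np.int8, "float32": np.float32}[dtype]
    return np.frombuffer(buf, dtype=np_dtype, count=num_elements)

print(unpack_bytearray(bytes([0, 127, 128, 255]), 4, "uint8"))  # [  0 127 128 255]
```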
These let you use the options from config files while ignoring their xfail and
skip settings, making it easy to get logs with all test failures, like these:

* [pytest_compile_only_2024_03_21.txt](https://gist.github.com/ScottTodd/ecc9c57c01bfc5e996a15cdd38df6a9c)
* [pytest_all_failures_2024_03_21.txt](https://gist.github.com/ScottTodd/1a02531cc76a3b8566428207e39d1870)

Both of those are using `IREE compiler version 20240321.838 @
5f2743baaa79160a3883854e60e5188822dceeb1`

(This was previously possible by editing the config .json file, but now it's easier.)
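A rough Python sketch of how those flags could be wired into pytest (assumed wiring for illustration; the suite's actual conftest.py may differ):

```
# conftest.py (sketch): expose the two options used in the repro below, so a
# config file still supplies compile/run flags while its xfail/skip lists can
# be bypassed.
def pytest_addoption(parser):
    parser.addoption(
        "--config-files",
        action="append",
        default=[],
        help="JSON config file(s) providing tool flags and expected-failure lists",
    )
    parser.addoption(
        "--ignore-xfails",
        action="store_true",
        default=False,
        help="Keep the config's flags but run tests it expects to fail anyway",
    )
```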

So minimal repro instructions for latest nightly IREE ONNX test suite
failures:

```
git clone https://github.com/nod-ai/SHARK-TestSuite.git
cd SHARK-TestSuite
python -m venv .venv
source .venv/bin/activate
python -m pip install -r iree_tests/requirements.txt
python -m pip install --find-links https://iree.dev/pip-release-links.html iree-compiler iree-runtime --upgrade
pytest iree_tests/onnx -n auto --ignore-xfails --config-files ./iree_tests/configs/config_cpu_llvm_sync.json
```

See the README for other common scenarios (like using a source build of IREE).
ScottTodd and others added 29 commits March 26, 2024 13:53
This commit plugs the turbine tank exported MLIR into the IREE testing
framework for the llama + SD models.
Couldn't add a real-weight test for llama: the real weight file is 20GB and
crashes the runner when loading it.
On that note, it looks like the splat test cases aren't running at the moment?
Maybe we can enable the real-weight test after we get the cluster.
Also, in case it's useful for anyone adding models in the future: for the
exported model MLIR, I had to update the real-weight flag to
`--parameters=model=real_weights.irpa` to get it working.
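For illustration, a hedged Python sketch of the resulting run invocation (the module name is a placeholder; only the `--parameters=model=real_weights.irpa` flag is the point):

```
import subprocess

# Sketch: run a compiled turbine-tank model with its externalized weights
# supplied as an IRPA parameter archive.
subprocess.run(
    [
        "iree-run-module",
        "--module=model.vmfb",          # placeholder .vmfb name
        "--device=local-task",
        "--parameters=model=real_weights.irpa",
        "--flagfile=test_data_flags.txt",
    ],
    check=True,
)
```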
I added a test case for SiLU at Elliott's request.
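A minimal sketch of the op under test (a plain torch module; the actual e2eshark model.py template also wires up test inputs and outputs for the run stages):

```
import torch

class SiLUModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.act = torch.nn.SiLU()

    def forward(self, x):
        # silu(x) = x * sigmoid(x)
        return self.act(x)

model = SiLUModule()
sample = torch.randn(1, 8, 8)
print(model(sample).shape)  # torch.Size([1, 8, 8])
```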
The model.onnx comes from running the QuantizedMLP_basic test in
torch-mlir with the onnx config.

With changes in [torch-mlir
PR#3089](llvm/torch-mlir#3089), this test with
--torchtolinalg flag still fails due to issues with mixed signedness of
quantized matmul arguments. See
[torch-mlir#3090](llvm/torch-mlir#3090).
Signed-off-by: Gaurav Shukla <gashukla@amd.com>
Also simplify the GPU config, since it is just a reference file.
Follow-up to
#126 (comment)

All I needed to do to generate this PR was pull the changes (with git lfs
installed). To avoid this in the future, I think others working with these
files should also install git lfs.
…om az storage. Add template model.py for tests
If a test unexpectedly passes the compilation phase, this now keeps
going and tests the runtime. Depending on the outcome, a specific
message is logged allowing for easier updating of config files. Updating
is still a manual process, but it now involves less moving of test cases
back and forth between XFAIL lists.
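A hedged Python sketch of that decision logic (function and exception names are hypothetical; the real harness shells out to iree-compile and iree-run-module):

```
class XPassStrict(Exception):
    """Compile unexpectedly succeeded and the run also passed."""


class MoveToRunFailures(Exception):
    """Compile unexpectedly succeeded but the run then failed."""


def report(test_name, compiled_ok, ran_ok, config):
    expected_compile_fail = test_name in config["expected_compile_failures"]
    if expected_compile_fail and not compiled_ok:
        return "xfail (compile failed as expected)"
    if expected_compile_fail and ran_ok:
        raise XPassStrict(
            "Expected compile to fail (remove from 'expected_compile_failures')")
    if expected_compile_fail and not ran_ok:
        raise MoveToRunFailures(
            "Expected compile failure but run failed (move to 'expected_run_failures')")
    return "passed" if ran_ok else "run failed (check 'expected_run_failures')"
```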

Sample logs:
```
________ IREE compile and run: test_averagepool_2d_precomputed_strides::cpu_llvm_sync_test ________ [gw20] win32 -- Python 3.11.2 D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Scripts\python.exe
[XPASS(strict)] Expected compile to fail (remove from 'expected_compile_failures')
```

```
___ IREE compile and run: test_dynamicquantizelinear_min_adjusted_expanded::cpu_llvm_sync_test ____ [gw50] win32 -- Python 3.11.2 D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Scripts\python.exe
Expected compile failure but run failed (move to 'expected_run_failures'):
Error invoking iree-run-module
Error code: 1
Stderr diagnostics:

Stdout diagnostics:
EXEC @test_dynamicquantizelinear_min_adjusted_expanded
[FAILED] result[0]: metadata is 3x4xi8; expected that the view matches 3x4xui8; expected that the view is equal to contents of a view of 3x4xui8
  expected:
3x4xui8=[64 134 83 159][213 255 96 166][249 255 191 149]
  actual:
3x4xi8=[64 -122 83 -97][-43 -1 96 -90][-7 -1 -65 -107]
[FAILED] result[2]: metadata is i8; expected that the view matches ui8; expected that the view is equal to contents of a view of ui8
  expected:
ui8=0
  actual:
i8=0

Compiled with:
  cd D:\dev\projects\SHARK-TestSuite\iree_tests\onnx\node\generated\test_dynamicquantizelinear_min_adjusted_expanded && iree-compile model.mlir --iree-hal-target-backends=llvm-cpu -o model_cpu_llvm_sync_test.vmfb

Run with:
  cd D:\dev\projects\SHARK-TestSuite\iree_tests\onnx\node\generated\test_dynamicquantizelinear_min_adjusted_expanded && iree-run-module --module=model_cpu_llvm_sync_test.vmfb --device=local-sync --flagfile=test_data_flags.txt
```
Sounds like this is the more standard form to include.
This commit adds support for testing all sdxl submodels
RAFT_vaiq_int8
RDN_pytorch_vaiq_int8
pytorch-3dunet_vaiq_int8
MobileNetV3_small_vaiq_int8

Generated issues #598 #599
1. opt-125m is not in azure storage yet. I'll see if I can figure out
how to add it.
2. using past_sequence_length != 0 results in an onnx runtime error that
looks virtually identical to this issue:
[onnx#6130](microsoft/onnxruntime#6130). I'm
not sure how the onnx file was generated, but perhaps there is an option
mentioned in that thread which can fix the issue.
3. using past_sequence_length = 0 will successfully run in onnx, but
fails in torch-mlir due to shape issues present in the torch-onnx-mlir
IR (possible importer issue). A rough onnxruntime-only repro sketch follows this list.
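The sketch below shows the general idea of driving such a model through onnxruntime with empty `past` tensors; the file path and the symbolic-dimension handling are assumptions for illustration, not the suite's actual harness:

```
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("opt-125m/model.onnx")  # placeholder path

feeds = {}
for inp in session.get_inputs():
    # Symbolic dims come back as strings; pin past_sequence_length to 0 and
    # any other free dim (batch, sequence) to 1 for a smoke test.
    shape = [
        0 if isinstance(d, str) and "past" in d else (1 if isinstance(d, str) else d)
        for d in inp.shape
    ]
    dtype = np.int64 if "int64" in inp.type else np.float32
    feeds[inp.name] = np.zeros(shape, dtype=dtype)

print([out.shape for out in session.run(None, feeds)])
```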
This commit adds the V0 support for testing all sdxl submodels. This PR
is not a long-term solution for benchmarking, as Scott and I discussed
here: #152. It is the result of a request from our team to get sdxl
benchmarking in ASAP. Because of the high priority of adding this to the
sdxl testing as our team lands patches in IREE, this is just meant to get
the implementation in and working. Scott and I discussed some more
intensive and well-structured ways to add benchmarking, which either of
us may implement in the future.

Also, this PR depends on #152 landing first (hence the CI failure).

Notes for the future, if we decide we need a stronger implementation:

1. Maybe something like iree-org/iree#16965 which
will feed into https://perf.iree.dev/.
2. This is the benchmarking framework we already have:
https://github.com/openxla/iree-comparative-benchmark and
https://github.com/openxla/iree/tree/main/build_tools/benchmarks
3. Some questions Scott raised to keep in mind for future
implementation:
* What metrics/artifacts do we want from benchmarking?
  * Each model in isolation? Full pipeline latency? Just dispatch time?
* What do we want done with benchmark results / artifacts?
  * The in-tree benchmarks in IREE submit results to a dashboard (that should use a queryable database...), upload Tracy files to cloud storage, and comment on pending pull requests with results summaries.
* Where do we want benchmarks to run?
  * Right after tests, on presubmit to IREE?
  * In a separate job, on separate runners?
If we decide benchmarking needs changes, we will address all of these
and come up with a more structured, methodical implementation that
either creates a new benchmarking flow here or plugs into the iree
benchmarking setup.
- adds support for downloading and running the e2e tests for onnx models

- currently supported models: 
    - onnx/models/DarkNet53_vaiq_int8 
    - onnx/models/DenseNet201_vaiq_int8 
    - onnx/models/EfficientNet_v2_s_vaiq_int8 
    - onnx/models/GoogLeNet_vaiq_int8 
    - onnx/models/Inception_v4_vaiq_int8 
    - onnx/models/LRASPP_vaiq_int8 
    - onnx/models/MNASNet_1_3_vaiq_int8 
    - onnx/models/MobileNetV3_small_vaiq_int8 
    - onnx/models/RDN_pytorch_vaiq_int8 
    - onnx/models/RegNet_y_8gf_vaiq_int8 
    - onnx/models/ResNet152_vaiq_int8 
    - onnx/models/retinanet_resnet50_fpn_vaiq_int8 
    - onnx/models/RRDB_ESRGAN_vaiq_int8 
    - onnx/models/ShuffleNet_v2_x2_0_vaiq_int8 
    - onnx/models/SqueezeNet_1_1_vaiq_int8 
    - onnx/models/VGG11_bn_vaiq_int8 
    - onnx/models/VGG19_vaiq_int8 
    - onnx/models/VideoResNet_vaiq_int8 
    - onnx/models/WideResNet_50_2_vaiq_int8 
    - onnx/models/YoloNetV3_vaiq_int8 
    - onnx/models/yolov8n_vaiq_int8 
    - onnx/models/u-net_brain_mri_vaiq_int8 
   
   More on the way :)
Pass all tests with llvm/torch-mlir#3134

```
| tests                              | model-run   | onnx-import   | torch-mlir   | iree-compile   | inference   |
|:-----------------------------------|:------------|:--------------|:-------------|:---------------|:------------|
| onnx/operators/ReduceProdKeepdims1 | passed      | passed        | passed       | passed         | passed      |
|:-----------------------------------|:------------|:--------------|:-------------|:---------------|:------------|
| onnx/operators/ReduceProdKeepdims0 | passed      | passed        | passed       | passed         | passed      |
```
This is to test some work I'm doing on quantized batch matmul in
torch-mlir to get Opt-125m to lower to linalg.

For now, I'm focusing on the one case I'm seeing in opt-125m, which is
rank-3 x rank-2 -> rank-3 MatMulInteger ops.
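As an illustration of that shape/signedness combination, a small Python sketch that builds such a model with the onnx helper API (shapes and zero points are arbitrary, not taken from opt-125m):

```
import onnx
from onnx import TensorProto, helper

# rank-3 uint8 activations x rank-2 int8 weights -> rank-3 int32 result,
# i.e. the mixed-signedness MatMulInteger pattern described above.
a = helper.make_tensor_value_info("A", TensorProto.UINT8, [2, 4, 8])
b = helper.make_tensor_value_info("B", TensorProto.INT8, [8, 16])
y = helper.make_tensor_value_info("Y", TensorProto.INT32, [2, 4, 16])
a_zp = helper.make_tensor("A_zero_point", TensorProto.UINT8, [], [128])
b_zp = helper.make_tensor("B_zero_point", TensorProto.INT8, [], [0])

node = helper.make_node(
    "MatMulInteger", ["A", "B", "A_zero_point", "B_zero_point"], ["Y"]
)
graph = helper.make_graph(
    [node], "matmulinteger_rank3_rank2", [a, b], [y], initializer=[a_zp, b_zp]
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)
onnx.save(model, "model.onnx")
```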
saienduri merged commit 5a73d7e into e2eshark_ci on Apr 12, 2024