[Codegen][GPU] Let integer range optimization narrow GPU computations to i32 #19473

krzysz00 · 2024-12-12T01:08:10Z

Note: This PR is stacked on top of #19372, and so looks bigger than it is. The relevant changes are in the last commit.

Add an option to -iree-util-optimize-int-arithmetic that makes it perform computations in i32 where possible; the option is enabled when optimizing arithmetic for GPU codegen. This allows LLVM to correctly conclude that various computations don't need to be done at full 64-bit precision, thus saving registers and instructions. (LLVM has some rewrites for this, but they are gated on conditions such as the potentially-truncated value having only one use, which means that shared math stays in an over-wide data type.)
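
To illustrate the idea (this is not the code in this PR), here is a minimal standalone sketch of the kind of range check that justifies narrowing: if the known value ranges of both operands and of the result fit in i32, a 64-bit add can be done in 32 bits and extended afterwards. The names `IntRange`, `fitsInI32`, `canNarrowAdd`, and `addNarrowed` are hypothetical and do not correspond to anything in IREE or MLIR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [lo, hi] of values a 64-bit integer may take,
// standing in for what an integer range analysis would report.
struct IntRange {
  int64_t lo;
  int64_t hi;
};

// True if every value in the range is representable as a signed 32-bit integer.
constexpr bool fitsInI32(IntRange r) {
  return r.lo >= std::numeric_limits<int32_t>::min() &&
         r.hi <= std::numeric_limits<int32_t>::max();
}

// An add can be narrowed when both operands and the result fit in i32.
// Once the operands are known to fit in i32, the bounds of the sum cannot
// overflow int64_t, so they can be computed directly.
constexpr bool canNarrowAdd(IntRange a, IntRange b) {
  if (!fitsInI32(a) || !fitsInI32(b))
    return false;
  IntRange sum{a.lo + b.lo, a.hi + b.hi};
  return fitsInI32(sum);
}

// Performs the add in 32 bits and sign-extends the result back to 64 bits.
// Only valid for operands whose ranges passed canNarrowAdd.
constexpr int64_t addNarrowed(int64_t a, int64_t b) {
  int32_t narrow = static_cast<int32_t>(a) + static_cast<int32_t>(b);
  return static_cast<int64_t>(narrow);
}

// Example: an index in [0, 1023] plus an offset in [0, 4095] can be narrowed.
static_assert(canNarrowAdd(IntRange{0, 1023}, IntRange{0, 4095}),
              "small index math fits in i32");
```

Because the decision here is made from value ranges rather than from use counts, it applies equally to a value with many users, which is the shared-math case described above.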

compiler/src/iree/compiler/Dialect/Util/Transforms/Passes.td

compiler/src/iree/compiler/Codegen/Transforms/Transforms.cpp

compiler/src/iree/compiler/Dialect/Util/Transforms/OptimizeIntArithmetic.cpp

tests/external/iree-test-suites/onnx_ops/onnx_ops_gpu_vulkan.json

MaheshRavishankar

LGTM!

Groverkss · 2025-01-17T17:08:55Z

This pass sometimes goes into an infinite loop.

Repro: https://gist.github.com/Groverkss/33313b03fd6cb600553ef511f9e7c6a5

iree-opt --iree-util-optimize-int-arithmetic='narrow-to-i32=true'

benvanik reviewed Dec 12, 2024

compiler/src/iree/compiler/Dialect/Util/Transforms/Passes.td (outdated)

krzysz00 force-pushed the index-narrowing branch 3 times, most recently from be116ef to ffa5fc4 Compare December 13, 2024 20:21

krzysz00 marked this pull request as ready for review December 17, 2024 17:28

krzysz00 requested review from antiagainst, MaheshRavishankar, kuhar, qedawkins and Groverkss as code owners December 17, 2024 17:28

krzysz00 force-pushed the index-narrowing branch 2 times, most recently from 01736ba to a367857 Compare January 6, 2025 21:47

qedawkins reviewed Jan 7, 2025

krzysz00 force-pushed the index-narrowing branch from a367857 to 09d7c2d Compare January 7, 2025 22:26

krzysz00 requested review from qedawkins and benvanik January 9, 2025 18:40

krzysz00 added 8 commits January 11, 2025 00:34

Let integer range optimizations narrow to i32

d183581

Update tests

e3121c9

Add pattern for narrowing assume.int with i32 inputs

28bf60f

Remove stray debug print

8078928

Narrow for loops too

faf0837

Review comments

ff804b7

I incidentally make a bunch of averagepool tests pass

71ccecc

Review comment - add builder to util.assume.int

5edb60b

MaheshRavishankar reviewed Jan 13, 2025

tests/external/iree-test-suites/onnx_ops/onnx_ops_gpu_vulkan.json

MaheshRavishankar approved these changes Jan 13, 2025

Use correct builder

272e21f

krzysz00 force-pushed the index-narrowing branch from 881a314 to 272e21f Compare January 13, 2025 17:35

krzysz00 merged commit 2452b22 into iree-org:main Jan 13, 2025
37 checks passed
