[RISCV][VLOPT] Enable the RISCVVLOptimizer by default #119461

michaelmaitland · 2024-12-10T22:07:43Z

Now that we have testing of all instructions in the isSupportedInstr switch, and better coverage of getOperandInfo, I think it is a good time to enable this by default.

llvmbot · 2024-12-10T22:08:08Z

@llvm/pr-subscribers-backend-risc-v

Author: Michael Maitland (michaelmaitland)

Changes

Now that we have testing of all instructions in the isSupportedInstr switch, and better coverage of getOperandInfo, I think it is a good time to enable this by default.

I'd like for #112231 and #119416 to land before this patch, so it'd be great for anyone reviewing this to check those out first.

Patch is 81.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119461.diff

34 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+1-1)
(modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+2-1)
(modified) llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll (+2-4)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll (+4-16)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll (+2-1)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll (+6-3)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll (+3-1)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll (+2-1)
(modified) llvm/test/CodeGen/RISCV/rvv/vdiv-vp.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vdivu-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+72-76)
(modified) llvm/test/CodeGen/RISCV/rvv/vl-opt-op-info.ll (+16-30)
(modified) llvm/test/CodeGen/RISCV/rvv/vl-opt.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vmax-vp.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vmaxu-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vmin-vp.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vminu-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vmul-vp.ll (+3-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vrem-vp.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vremu-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsadd-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsaddu-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll (-3)
(modified) llvm/test/CodeGen/RISCV/rvv/vshl-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsitofp-vp.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vsra-sdnode.ll (+2-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsra-vp.ll (-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vsrl-vp.ll (+1-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vssub-vp.ll (+2-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vssubu-vp.ll (+2-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vuitofp-vp.ll (+4-4)
(modified) llvm/test/CodeGen/RISCV/rvv/vwsll-vp.ll (+30-60)
(modified) llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll (+4-3)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index dcd3598f658f6a..c507ab3f4f3885 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -112,7 +112,7 @@ static cl::opt<bool> EnablePostMISchedLoadStoreClustering(
 static cl::opt<bool>
     EnableVLOptimizer("riscv-enable-vl-optimizer",
                       cl::desc("Enable the RISC-V VL Optimizer pass"),
-                      cl::init(false), cl::Hidden);
+                      cl::init(true), cl::Hidden);
 
 static cl::opt<bool> DisableVectorMaskMutation(
     "riscv-disable-vector-mask-mutation",
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 8fd9ae98503665..b0c756e26985bb 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -119,6 +119,8 @@
 ; RV64-NEXT:        RISC-V Optimize W Instructions
 ; CHECK-NEXT:       RISC-V Pre-RA pseudo instruction expansion pass
 ; CHECK-NEXT:       RISC-V Merge Base Offset
+; CHECK-NEXT:       MachineDominator Tree Construction
+; CHECK-NEXT:       RISC-V VL Optimizer
 ; CHECK-NEXT:       RISC-V Insert Read/Write CSR Pass
 ; CHECK-NEXT:       RISC-V Insert Write VXRM Pass
 ; CHECK-NEXT:       RISC-V Landing Pad Setup
@@ -129,7 +131,6 @@
 ; CHECK-NEXT:       Live Variable Analysis
 ; CHECK-NEXT:       Eliminate PHI nodes for register allocation
 ; CHECK-NEXT:       Two-Address instruction pass
-; CHECK-NEXT:       MachineDominator Tree Construction
 ; CHECK-NEXT:       Slot index numbering
 ; CHECK-NEXT:       Live Interval Analysis
 ; CHECK-NEXT:       Register Coalescer
diff --git a/llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll b/llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll
index ce4bc48dff0426..6f515996677ee6 100644
--- a/llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/ctlz-vp.ll
@@ -2654,9 +2654,8 @@ define <vscale x 1 x i9> @vp_ctlo_zero_undef_nxv1i9(<vscale x 1 x i9> %va, <vsca
 ; CHECK-LABEL: vp_ctlo_zero_undef_nxv1i9:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a1, 511
-; CHECK-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
-; CHECK-NEXT:    vxor.vx v8, v8, a1
 ; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-NEXT:    vxor.vx v8, v8, a1
 ; CHECK-NEXT:    vsll.vi v8, v8, 7, v0.t
 ; CHECK-NEXT:    vfwcvt.f.xu.v v9, v8, v0.t
 ; CHECK-NEXT:    vsetvli zero, zero, e32, mf2, ta, ma
@@ -2670,9 +2669,8 @@ define <vscale x 1 x i9> @vp_ctlo_zero_undef_nxv1i9(<vscale x 1 x i9> %va, <vsca
 ; CHECK-ZVBB-LABEL: vp_ctlo_zero_undef_nxv1i9:
 ; CHECK-ZVBB:       # %bb.0:
 ; CHECK-ZVBB-NEXT:    li a1, 511
-; CHECK-ZVBB-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
-; CHECK-ZVBB-NEXT:    vxor.vx v8, v8, a1
 ; CHECK-ZVBB-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
+; CHECK-ZVBB-NEXT:    vxor.vx v8, v8, a1
 ; CHECK-ZVBB-NEXT:    vsll.vi v8, v8, 7, v0.t
 ; CHECK-ZVBB-NEXT:    vclz.v v8, v8, v0.t
 ; CHECK-ZVBB-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
index ac7d3d9109e39c..3153b44386d7ae 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-abs.ll
@@ -39,9 +39,7 @@ define void @abs_v6i16(ptr %x) {
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vle16.v v8, (a0)
-; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vrsub.vi v9, v8, 0
-; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vmax.vv v8, v8, v9
 ; CHECK-NEXT:    vse16.v v8, (a0)
 ; CHECK-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll
index 36bbec12e9b06c..15793eaada0783 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp.ll
@@ -788,11 +788,9 @@ define void @copysign_v6bf16(ptr %x, ptr %y) {
 ; CHECK-NEXT:    vle16.v v8, (a1)
 ; CHECK-NEXT:    vle16.v v9, (a0)
 ; CHECK-NEXT:    lui a1, 8
-; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vand.vx v8, v8, a1
 ; CHECK-NEXT:    addi a1, a1, -1
 ; CHECK-NEXT:    vand.vx v9, v9, a1
-; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vor.vv v8, v9, v8
 ; CHECK-NEXT:    vse16.v v8, (a0)
 ; CHECK-NEXT:    ret
@@ -848,11 +846,9 @@ define void @copysign_v6f16(ptr %x, ptr %y) {
 ; ZVFHMIN-NEXT:    vle16.v v8, (a1)
 ; ZVFHMIN-NEXT:    vle16.v v9, (a0)
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vand.vx v8, v8, a1
 ; ZVFHMIN-NEXT:    addi a1, a1, -1
 ; ZVFHMIN-NEXT:    vand.vx v9, v9, a1
-; ZVFHMIN-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vor.vv v8, v9, v8
 ; ZVFHMIN-NEXT:    vse16.v v8, (a0)
 ; ZVFHMIN-NEXT:    ret
@@ -924,12 +920,10 @@ define void @copysign_vf_v6bf16(ptr %x, bfloat %y) {
 ; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vle16.v v8, (a0)
 ; CHECK-NEXT:    lui a2, 8
-; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.x v9, a1
 ; CHECK-NEXT:    addi a1, a2, -1
 ; CHECK-NEXT:    vand.vx v8, v8, a1
 ; CHECK-NEXT:    vand.vx v9, v9, a2
-; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vor.vv v8, v8, v9
 ; CHECK-NEXT:    vse16.v v8, (a0)
 ; CHECK-NEXT:    ret
@@ -986,12 +980,10 @@ define void @copysign_vf_v6f16(ptr %x, half %y) {
 ; ZVFHMIN-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vle16.v v8, (a0)
 ; ZVFHMIN-NEXT:    lui a2, 8
-; ZVFHMIN-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v9, a1
 ; ZVFHMIN-NEXT:    addi a1, a2, -1
 ; ZVFHMIN-NEXT:    vand.vx v8, v8, a1
 ; ZVFHMIN-NEXT:    vand.vx v9, v9, a2
-; ZVFHMIN-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vor.vv v8, v8, v9
 ; ZVFHMIN-NEXT:    vse16.v v8, (a0)
 ; ZVFHMIN-NEXT:    ret
@@ -1065,11 +1057,9 @@ define void @copysign_neg_v6bf16(ptr %x, ptr %y) {
 ; CHECK-NEXT:    vle16.v v9, (a0)
 ; CHECK-NEXT:    lui a1, 8
 ; CHECK-NEXT:    addi a2, a1, -1
-; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vxor.vx v8, v8, a1
 ; CHECK-NEXT:    vand.vx v9, v9, a2
 ; CHECK-NEXT:    vand.vx v8, v8, a1
-; CHECK-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; CHECK-NEXT:    vor.vv v8, v9, v8
 ; CHECK-NEXT:    vse16.v v8, (a0)
 ; CHECK-NEXT:    ret
@@ -1129,11 +1119,9 @@ define void @copysign_neg_v6f16(ptr %x, ptr %y) {
 ; ZVFHMIN-NEXT:    vle16.v v9, (a0)
 ; ZVFHMIN-NEXT:    lui a1, 8
 ; ZVFHMIN-NEXT:    addi a2, a1, -1
-; ZVFHMIN-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v8, v8, a1
 ; ZVFHMIN-NEXT:    vand.vx v9, v9, a2
 ; ZVFHMIN-NEXT:    vand.vx v8, v8, a1
-; ZVFHMIN-NEXT:    vsetivli zero, 6, e16, m1, ta, ma
 ; ZVFHMIN-NEXT:    vor.vv v8, v9, v8
 ; ZVFHMIN-NEXT:    vse16.v v8, (a0)
 ; ZVFHMIN-NEXT:    ret
@@ -1211,12 +1199,12 @@ define void @copysign_neg_trunc_v3bf16_v3f32(ptr %x, ptr %y) {
 ; CHECK-NEXT:    vle32.v v9, (a1)
 ; CHECK-NEXT:    lui a1, 8
 ; CHECK-NEXT:    addi a2, a1, -1
-; CHECK-NEXT:    vsetivli zero, 4, e16, mf2, ta, ma
 ; CHECK-NEXT:    vand.vx v8, v8, a2
+; CHECK-NEXT:    vsetivli zero, 4, e16, mf2, ta, ma
 ; CHECK-NEXT:    vfncvtbf16.f.f.w v10, v9
+; CHECK-NEXT:    vsetivli zero, 3, e16, mf2, ta, ma
 ; CHECK-NEXT:    vxor.vx v9, v10, a1
 ; CHECK-NEXT:    vand.vx v9, v9, a1
-; CHECK-NEXT:    vsetivli zero, 3, e16, mf2, ta, ma
 ; CHECK-NEXT:    vor.vv v8, v8, v9
 ; CHECK-NEXT:    vse16.v v8, (a0)
 ; CHECK-NEXT:    ret
@@ -1283,12 +1271,12 @@ define void @copysign_neg_trunc_v3f16_v3f32(ptr %x, ptr %y) {
 ; ZVFHMIN-NEXT:    vle32.v v9, (a1)
 ; ZVFHMIN-NEXT:    lui a1, 8
 ; ZVFHMIN-NEXT:    addi a2, a1, -1
-; ZVFHMIN-NEXT:    vsetivli zero, 4, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vand.vx v8, v8, a2
+; ZVFHMIN-NEXT:    vsetivli zero, 4, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vfncvt.f.f.w v10, v9
+; ZVFHMIN-NEXT:    vsetivli zero, 3, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v10, a1
 ; ZVFHMIN-NEXT:    vand.vx v9, v9, a1
-; ZVFHMIN-NEXT:    vsetivli zero, 3, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vor.vv v8, v8, v9
 ; ZVFHMIN-NEXT:    vse16.v v8, (a0)
 ; ZVFHMIN-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
index e9fd0a19e3eb66..276b5401a902a4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
@@ -910,8 +910,9 @@ define <4 x i8> @buildvec_not_vid_v4i8_2() {
 define <16 x i8> @buildvec_not_vid_v16i8() {
 ; CHECK-LABEL: buildvec_not_vid_v16i8:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 16, e8, m1, ta, ma
+; CHECK-NEXT:    vsetivli zero, 7, e8, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.i v9, 3
+; CHECK-NEXT:    vsetivli zero, 16, e8, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.i v8, 0
 ; CHECK-NEXT:    vsetivli zero, 7, e8, m1, tu, ma
 ; CHECK-NEXT:    vslideup.vi v8, v9, 6
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
index 1c6e1a37fa8af5..a8e12dfaa82e9c 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
@@ -348,8 +348,9 @@ define <8 x i8> @splat_ve4_ins_i0ve2(<8 x i8> %v) {
 define <8 x i8> @splat_ve4_ins_i1ve3(<8 x i8> %v) {
 ; CHECK-LABEL: splat_ve4_ins_i1ve3:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 2, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v9, 3
+; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v10, 4
 ; CHECK-NEXT:    vsetivli zero, 2, e8, mf2, tu, ma
 ; CHECK-NEXT:    vslideup.vi v10, v9, 1
@@ -432,8 +433,9 @@ define <8 x i8> @splat_ve2_we0_ins_i2ve4(<8 x i8> %v, <8 x i8> %w) {
 define <8 x i8> @splat_ve2_we0_ins_i2we4(<8 x i8> %v, <8 x i8> %w) {
 ; CHECK-LABEL: splat_ve2_we0_ins_i2we4:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 3, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v10, 4
+; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v11, 0
 ; CHECK-NEXT:    li a0, 70
 ; CHECK-NEXT:    vsetivli zero, 3, e8, mf2, tu, ma
@@ -451,8 +453,9 @@ define <8 x i8> @splat_ve2_we0_ins_i2we4(<8 x i8> %v, <8 x i8> %w) {
 define <8 x i8> @splat_ve2_we0_ins_i2ve4_i5we6(<8 x i8> %v, <8 x i8> %w) {
 ; CHECK-LABEL: splat_ve2_we0_ins_i2ve4_i5we6:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
+; CHECK-NEXT:    vsetivli zero, 6, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v10, 6
+; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
 ; CHECK-NEXT:    vmv.v.i v11, 0
 ; CHECK-NEXT:    lui a0, 8256
 ; CHECK-NEXT:    addi a0, a0, 2
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
index cba8de82ec41b9..59c7feb53ce94e 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
@@ -1100,15 +1100,17 @@ define void @mulhu_v8i16(ptr %x) {
 ; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vle16.v v8, (a0)
 ; CHECK-NEXT:    vmv.v.i v9, 0
+; CHECK-NEXT:    vsetivli zero, 7, e16, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.i v10, 1
 ; CHECK-NEXT:    li a1, 33
 ; CHECK-NEXT:    vmv.s.x v0, a1
 ; CHECK-NEXT:    lui a1, %hi(.LCPI66_0)
 ; CHECK-NEXT:    addi a1, a1, %lo(.LCPI66_0)
+; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.i v11, 3
 ; CHECK-NEXT:    vle16.v v12, (a1)
 ; CHECK-NEXT:    vmerge.vim v11, v11, 2, v0
-; CHECK-NEXT:    vmv.v.i v13, 0
+; CHECK-NEXT:    vmv1r.v v13, v9
 ; CHECK-NEXT:    vsetivli zero, 7, e16, m1, tu, ma
 ; CHECK-NEXT:    vslideup.vi v9, v10, 6
 ; CHECK-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll
index 66f95b70776720..abbbfe8f252fb2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll
@@ -97,8 +97,9 @@ define <4 x i32> @v4i32_v8i32(<8 x i32>) {
 define <4 x i32> @v4i32_v16i32(<16 x i32>) {
 ; RV32-LABEL: v4i32_v16i32:
 ; RV32:       # %bb.0:
-; RV32-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
+; RV32-NEXT:    vsetivli zero, 2, e16, m1, ta, ma
 ; RV32-NEXT:    vmv.v.i v12, 1
+; RV32-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; RV32-NEXT:    vmv.v.i v14, 6
 ; RV32-NEXT:    li a0, 32
 ; RV32-NEXT:    vmv.v.i v0, 10
diff --git a/llvm/test/CodeGen/RISCV/rvv/vdiv-vp.ll b/llvm/test/CodeGen/RISCV/rvv/vdiv-vp.ll
index c7b5200979370e..2814be2792de9a 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vdiv-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vdiv-vp.ll
@@ -11,9 +11,7 @@ define <vscale x 8 x i7> @vdiv_vx_nxv8i7(<vscale x 8 x i7> %a, i7 signext %b, <v
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    vsetvli zero, a1, e8, m1, ta, ma
 ; CHECK-NEXT:    vsll.vi v8, v8, 1, v0.t
-; CHECK-NEXT:    vsetvli a2, zero, e8, m1, ta, ma
 ; CHECK-NEXT:    vmv.v.x v9, a0
-; CHECK-NEXT:    vsetvli zero, a1, e8, m1, ta, ma
 ; CHECK-NEXT:    vsra.vi v8, v8, 1, v0.t
 ; CHECK-NEXT:    vsll.vi v9, v9, 1, v0.t
 ; CHECK-NEXT:    vsra.vi v9, v9, 1, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/vdivu-vp.ll b/llvm/test/CodeGen/RISCV/rvv/vdivu-vp.ll
index 850ad863dd384e..3e913d4f682ed4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vdivu-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vdivu-vp.ll
@@ -10,9 +10,8 @@ define <vscale x 8 x i7> @vdivu_vx_nxv8i7(<vscale x 8 x i7> %a, i7 signext %b, <
 ; CHECK-LABEL: vdivu_vx_nxv8i7:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a2, 127
-; CHECK-NEXT:    vsetvli a3, zero, e8, m1, ta, ma
-; CHECK-NEXT:    vmv.v.x v9, a0
 ; CHECK-NEXT:    vsetvli zero, a1, e8, m1, ta, ma
+; CHECK-NEXT:    vmv.v.x v9, a0
 ; CHECK-NEXT:    vand.vx v8, v8, a2, v0.t
 ; CHECK-NEXT:    vand.vx v9, v9, a2, v0.t
 ; CHECK-NEXT:    vdivu.vv v8, v8, v9, v0.t
diff --git a/llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll b/llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
index 7ca1983e8b32c0..ab67e9833c78aa 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
@@ -4301,10 +4301,9 @@ define <vscale x 1 x half> @vfnmadd_vf_nxv1f16_neg_splat(<vscale x 1 x half> %va
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv1f16_neg_splat:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1, v0.t
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1, v0.t
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
@@ -4334,10 +4333,9 @@ define <vscale x 1 x half> @vfnmadd_vf_nxv1f16_neg_splat_commute(<vscale x 1 x h
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv1f16_neg_splat_commute:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1, v0.t
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1, v0.t
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
@@ -4367,10 +4365,9 @@ define <vscale x 1 x half> @vfnmadd_vf_nxv1f16_neg_splat_unmasked(<vscale x 1 x
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv1f16_neg_splat_unmasked:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
@@ -4400,10 +4397,9 @@ define <vscale x 1 x half> @vfnmadd_vf_nxv1f16_neg_splat_unmasked_commute(<vscal
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv1f16_neg_splat_unmasked_commute:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf4, ta, ma
@@ -4670,9 +4666,10 @@ define <vscale x 1 x half> @vfnmsub_vf_nxv1f16_neg_splat(<vscale x 1 x half> %va
 ; ZVFHMIN-LABEL: vfnmsub_vf_nxv1f16_neg_splat:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
+; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v11, v9
 ; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v10, a1, v0.t
@@ -4701,9 +4698,10 @@ define <vscale x 1 x half> @vfnmsub_vf_nxv1f16_neg_splat_commute(<vscale x 1 x h
 ; ZVFHMIN-LABEL: vfnmsub_vf_nxv1f16_neg_splat_commute:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
+; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v11, v9
 ; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v10, a1, v0.t
@@ -4732,9 +4730,10 @@ define <vscale x 1 x half> @vfnmsub_vf_nxv1f16_neg_splat_unmasked(<vscale x 1 x
 ; ZVFHMIN-LABEL: vfnmsub_vf_nxv1f16_neg_splat_unmasked:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
+; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v11, v9
 ; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v10, a1
@@ -4763,9 +4762,10 @@ define <vscale x 1 x half> @vfnmsub_vf_nxv1f16_neg_splat_unmasked_commute(<vscal
 ; ZVFHMIN-LABEL: vfnmsub_vf_nxv1f16_neg_splat_unmasked_commute:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
+; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vfwcvt.f.f.v v11, v9
 ; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v9, v10, a1
@@ -5220,10 +5220,9 @@ define <vscale x 2 x half> @vfnmadd_vf_nxv2f16_neg_splat(<vscale x 2 x half> %va
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv2f16_neg_splat:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf2, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1, v0.t
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1, v0.t
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf2, ta, ma
@@ -5253,10 +5252,9 @@ define <vscale x 2 x half> @vfnmadd_vf_nxv2f16_neg_splat_commute(<vscale x 2 x h
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv2f16_neg_splat_commute:
 ; ZVFHMIN:       # %bb.0:
 ; ZVFHMIN-NEXT:    fmv.x.h a1, fa0
-; ZVFHMIN-NEXT:    vsetvli a2, zero, e16, mf2, ta, ma
+; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vmv.v.x v10, a1
 ; ZVFHMIN-NEXT:    lui a1, 8
-; ZVFHMIN-NEXT:    vsetvli zero, a0, e16, mf2, ta, ma
 ; ZVFHMIN-NEXT:    vxor.vx v10, v10, a1, v0.t
 ; ZVFHMIN-NEXT:    vxor.vx v9, v9, a1, v0.t
 ; ZVFHMIN-NEXT:    vsetvli a1, zero, e16, mf2, ta, ma
@@ -5286,10 +5284,9 @@ define <vscale x 2 x half> @vfnmadd_vf_nxv2f16_neg_splat_unmasked(<vscale x 2 x
 ; ZVFHMIN-LABEL: vfnmadd_vf_nxv2f16_neg_splat_unmasked:
 ; ZVFHMI...
[truncated]

michaelmaitland · 2024-12-17T14:23:06Z

ping


lukel97 · 2024-12-17T16:08:40Z

Have you had a chance to kick the tires with this on llvm-test-suite or SPEC?

michaelmaitland · 2024-12-17T17:05:26Z

> Have you had a chance to kick the tires with this on llvm-test-suite or SPEC?

I had a chance to run this on spec2006/int/train and spec2017/int/train on qemu. Both build and terminate without errors.

preames · 2024-12-17T17:32:11Z

Reading through the code, I spotted one potential correctness issue. This is a corner case, but probably still worth fixing.

Imagine you have the following:

    %v = VADD_VV ...
    %s = VREDSUM w/ %v as scalar source
    %dead = VADD_VV %v, %v w/ VL=0

The last instruction is dead: it can be folded to its passthru. (In practice, it probably will have been folded, but it's possible something could slip through to here.) However, when scanning the users of %v, we will decide that the correct VL for %v is 0 (or a register which might be zero), and reduce it below the minimum VL=1 required by the reduction.

To fix this, I believe you need to treat the CommonVL for the scalar operand case as being VL=1. You could also track a non-zero state instead.

Other than that, looks good to me. Once you've fixed this issue, happy to approve.
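
The concern above can be illustrated with a small standalone model. This is only a sketch, not the code in RISCVVLOptimizer.cpp: the `VecUse` struct and both function names are hypothetical, and it assumes the producer's VL is reduced to the largest VL among its users. It contrasts skipping scalar-style uses with treating them as demanding VL=1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one use of a producer's result (not LLVM's types).
struct VecUse {
  bool UsedAsScalar; // e.g. the scalar source operand of a reduction
  uint64_t UserVL;   // VL of the using instruction
};

// Behaviour under discussion: scalar-style uses are skipped entirely, so
// only the remaining users decide how far the producer's VL may shrink.
inline uint64_t commonVLSkippingScalarUses(const std::vector<VecUse> &Uses) {
  uint64_t VL = 0;
  for (const VecUse &U : Uses) {
    if (U.UsedAsScalar)
      continue;
    VL = std::max(VL, U.UserVL);
  }
  return VL;
}

// Suggested fix: a scalar-style use reads element 0, so it demands VL=1
// no matter what VL its own instruction runs at.
inline uint64_t commonVLWithScalarDemand(const std::vector<VecUse> &Uses) {
  uint64_t VL = 0;
  bool SawScalarUse = false;
  for (const VecUse &U : Uses) {
    SawScalarUse |= U.UsedAsScalar;
    VL = std::max(VL, U.UsedAsScalar ? uint64_t{1} : U.UserVL);
  }
  assert((!SawScalarUse || VL >= 1) && "scalar use must keep element 0 live");
  return VL;
}
```

With one reduction-style use and one dead user at VL=0, the first function returns 0 (element 0 of %v is never written), while the second returns 1.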

lukel97 · 2024-12-17T17:53:37Z

> > Have you had a chance to kick the tires with this on llvm-test-suite or SPEC?
>
> I had a chance to run this on spec2006/int/train and spec2017/int/train on qemu. Both build and terminate without errors.

Nice, this also just came to mind but did you run it with the rvv_ta_all_1s=1 option set? I'm thinking if there were any potential miscompiles then this would probably be needed to catch them

michaelmaitland · 2024-12-17T18:02:56Z

> > > Have you had a chance to kick the tires with this on llvm-test-suite or SPEC?
> >
> > I had a chance to run this on spec2006/int/train and spec2017/int/train on qemu. Both build and terminate without errors.
>
> Nice, this also just came to mind but did you run it with the rvv_ta_all_1s=1 option set? I'm thinking if there were any potential miscompiles then this would probably be needed to catch them

Yes

topperc · 2024-12-17T18:14:57Z

> Reading through the code, I spotted one potential correctness issue. This is a corner case, but probably still worth fixing.
>
> Imagine you have the following:
>
>     %v = VADD_VV ...
>     %s = VREDSUM w/ %v as scalar source
>     %dead = VADD_VV %v, %v w/ VL=0
>
> The last instruction is dead: it can be folded to its passthru. (In practice, it probably will have been folded, but it's possible something could slip through to here.) However, when scanning the users of %v, we will decide that the correct VL for %v is 0 (or a register which might be zero), and reduce it below the minimum VL=1 required by the reduction.
>
> To fix this, I believe you need to treat the CommonVL for the scalar operand case as being VL=1. You could also track a non-zero state instead.
>
> Other than that, looks good to me. Once you've fixed this issue, happy to approve.

Should we just remove this code for now? There are no directed tests for it.

    // Instructions like reductions may use a vector register as a scalar
    // register. In this case, we should treat it like a scalar register which
    // does not impact the decision on whether to optimize VL.
    if (isVectorOpUsedAsScalarOp(UserOp)) {
      [[maybe_unused]] Register R = UserOp.getReg();
      [[maybe_unused]] const TargetRegisterClass *RC = MRI->getRegClass(R);
      assert(RISCV::VRRegClass.hasSubClassEq(RC) &&
             "Expect LMUL 1 register class for vector as scalar operands!");
      LLVM_DEBUG(dbgs() << "    Use this operand as a scalar operand\n");
      continue;
    }

preames · 2024-12-17T18:51:49Z

> Should we just remove this code for now

This would be fine by me. Incrementalism is good. :)

michaelmaitland · 2024-12-17T20:13:46Z

> > Should we just remove this code for now
>
> This would be fine by me. Incrementalism is good. :)

I've removed it in #120291.

FWIW, I don't think what you're concerned about can happen, with or without #120291 merged, since all the instructions that isVectorOpUsedAsScalarOp deals with return OperandInfo(Unknown) and won't lead to any (incorrect) optimization.

topperc · 2024-12-17T20:27:17Z

> > > Should we just remove this code for now
> >
> > This would be fine by me. Incrementalism is good. :)
>
> I've removed it in #120291.
>
> FWIW, I don't think what you're concerned about can happen, with or without #120291 merged, since all the instructions that isVectorOpUsedAsScalarOp deals with return OperandInfo(Unknown) and won't lead to any (incorrect) optimization.

The code that was there said that we could ignore the reduction and not call getOperandInfo on it. So it doesn't matter that the reduction is missing from getOperandInfo. The code effectively said that a scalar operand doesn't depend on any elements from the producer. This is incorrect: it demands exactly 1 element. With that code in place, only the VL of the other consumers was used. If they used fewer than 1 element, then the 1 element that the scalar op needs wouldn't be valid.

michaelmaitland · 2024-12-17T20:28:34Z

> > > > Should we just remove this code for now
> > >
> > > This would be fine by me. Incrementalism is good. :)
> >
> > I've removed it in #120291.
> >
> > FWIW, I don't think what you're concerned about can happen, with or without #120291 merged, since all the instructions that isVectorOpUsedAsScalarOp deals with return OperandInfo(Unknown) and won't lead to any (incorrect) optimization.
>
> The code that was there said that we could ignore the reduction and not call getOperandInfo on it. So it doesn't matter that the reduction is missing from getOperandInfo. The code effectively said that a scalar operand doesn't depend on any elements from the producer. This is incorrect: it demands exactly 1 element. With that code in place, only the VL of the other consumers was used. If they used fewer than 1 element, then the 1 element that the scalar op needs wouldn't be valid.

Yes, my bad. I agree we should remove the suggested code in #120291.

preames

LGTM

preames · 2024-12-17T21:06:27Z

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll

 ; CHECK-NEXT:    vmv.v.i v9, 3
+; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma


Non-blocking, but this shows a case where we probably want to teach VSETVLI insertion that it can increase VL if the instruction is tail undefined.
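
A rough standalone sketch of that idea follows. Everything in it is hypothetical (the `VInst` struct, `canRunAtVL`, and `countVLSets` are not part of the VSETVLI insertion pass), and it assumes all instructions share the same SEW/LMUL, that the larger VL does not exceed VLMAX, and that the widened instruction cannot trap or access memory.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical summary of a vector instruction for this sketch only.
struct VInst {
  unsigned VL;        // VL the instruction currently asks for
  bool TailUndefined; // tail agnostic with an undefined passthru
};

// Running at a larger VL only writes extra elements that nobody may rely
// on, so it is acceptable when the tail is undefined.
inline bool canRunAtVL(const VInst &I, unsigned NewVL) {
  return NewVL == I.VL || (NewVL > I.VL && I.TailUndefined);
}

// Counts the VL-setting instructions a straight-line sequence needs if each
// instruction may adopt the VL chosen for the instruction that follows it.
inline unsigned countVLSets(const std::vector<VInst> &Insts) {
  unsigned Changes = 0;
  std::optional<unsigned> Next; // VL chosen for the following instruction
  for (auto It = Insts.rbegin(); It != Insts.rend(); ++It) {
    unsigned Chosen = (Next && canRunAtVL(*It, *Next)) ? *Next : It->VL;
    assert(Chosen >= It->VL && "never shrink below the requested VL");
    if (Next && Chosen != *Next)
      ++Changes;
    Next = Chosen;
  }
  return Insts.empty() ? 0 : Changes + 1; // +1 for the first vsetvli
}
```

For a pair shaped like the two `vmv.v.i` instructions above (VL=2 followed by VL=8), the model needs one VL-setting instruction when the first is tail undefined and two when it is not.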

llvm-ci · 2024-12-17T21:22:03Z

LLVM Buildbot has detected a new failure on builder cross-project-tests-sie-ubuntu running on doug-worker-1a while building llvm at step 2 "checkout".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/181/builds/10532

Here is the relevant piece of the build log for the reference

Step 2 (checkout) failure: update (failure)

lukel97 · 2024-12-18T04:21:08Z

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-changes-length.ll

 ; RV32-NEXT:    vmv.v.i v12, 1
+; RV32-NEXT:    vsetivli zero, 8, e16, m1, ta, ma
 ; RV32-NEXT:    vmv.v.i v14, 6


Do we know why only one of the vmv.v.is had their VL reduced here?

Edit, just seeing Philip's comment above that explains it.

Take a look at the MIR: https://godbolt.org/z/xrvG13qx6

You can see that %4 is used as a tied operand. We don't optimize that case:

llvm-project/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp

Line 917 in 644643a

// Tied operands might pass through.
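
As a simplified illustration of why a tied use blocks the reduction, here is a standalone sketch. The `OperandUse` struct and `reducedProducerVL` are hypothetical names, not the optimizer's real interface, and the model assumes the producer would otherwise be shrunk to the largest VL among its users.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical description of how a user consumes the producer's result.
struct OperandUse {
  bool Tied;       // result is tied to this operand (e.g. a passthru)
  unsigned UserVL; // VL of the using instruction
};

// Returns the smaller VL the producer could use, or std::nullopt when it
// should be left alone.
inline std::optional<unsigned>
reducedProducerVL(unsigned ProducerVL, const std::vector<OperandUse> &Uses) {
  if (Uses.empty())
    return std::nullopt;
  unsigned MaxUserVL = 0;
  for (const OperandUse &U : Uses) {
    // Elements past the user's VL may pass through a tied operand into the
    // user's result, so the user's VL says nothing about what is observed.
    if (U.Tied)
      return std::nullopt;
    MaxUserVL = std::max(MaxUserVL, U.UserVL);
  }
  if (MaxUserVL >= ProducerVL)
    return std::nullopt; // nothing to gain
  assert(MaxUserVL < ProducerVL && "only ever shrink the VL");
  return MaxUserVL;
}
```

In this model a producer at VL=8 with a single non-tied user at VL=2 is reduced to 2, while the same producer feeding a tied operand keeps VL=8, which matches the two `vmv.v.i` instructions in the test diff.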

…19461)" This reverts commit 169c32e.

Update LLVM to llvm/llvm-project@ac8bb735.

C++ changes are related to a change in behavior of TypeConverter. It used to generate UnrealizedConversionCastOp during applySignatureConversion in GenericOpTypePropagation of TypePropagationPass.cpp; however, it no longer does. This causes unrealized_conversion_cast to be generated later and hence survive the pass.

To repro the above behavior, try undoing the C++ change in this PR and then:

```
wget https://gist.githubusercontent.com/raikonenfnu/dfb3b274007df8c4be87daf9ee67a5f4/raw/e48cc07e5fa558cd2c450b0e3ae46568136e1be6/type_propagate_repro.mlir
iree-opt --pass-pipeline='builtin.module(func.func(iree-codegen-type-propagation))' propagate_test.mlir -o /dev/null
error: failed to legalize unresolved materialization from ('i8') to ('i1') that remained live after conversion
^bb0(%in: i1, %in_0: f32, %in_1: f32, %out: f32):
^
propagate_test.mlir:5:8: note: see current operation: %10 = "builtin.unrealized_conversion_cast"(%arg0) : (i8) -> i1
propagate_test.mlir:6:11: note: see existing live user here: %10 = arith.select %9, %in_0, %in_1 : f32
```

This PR also carries the following reverts:

llvm/llvm-project#120999
llvm/llvm-project#120115
llvm/llvm-project#119461

The main issue with PRs 120999 and 120115 is that they break matvec codegen, generating scf.if instead of scf.for(s). An issue will be pushed up for repro. The main issue with PR 119461 is that it breaks an e2e riscv test by making it get stuck in an infinite loop.

```
/path/to/iree-build/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu --iree-input-type=stablehlo --iree-input-demote-f64-to-f32 --iree-llvmcpu-target-cpu=generic /path/to/iree/tests/e2e/stablehlo_ops/three_fry.mlir -o three_fly_exec_target.mlir --iree-llvmcpu-target-triple=riscv64 --iree-llvmcpu-target-abi=lp64d --iree-llvmcpu-target-cpu-features=+m,+a,+d,+zvl512b,+v --mlir-disable-threading
> infinite loop
```

Signed-off-by: Stanley Winata <stanley.winata@amd.com>


Update LLVM to llvm/llvm-project@ac8bb735.

The C++ changes are related to a change in the behavior of TypeConverter, made in iree-org/llvm-project@3cc311a. It used to generate UnrealizedConversionCastOp during applySignatureConversion in GenericOpTypePropagation of TypePropagationPass.cpp, but it no longer does. This causes unrealized_conversion_cast to be generated later and hence to survive the pass. To reproduce this behavior, undo the C++ change in this PR and then run:

```
wget https://gist.githubusercontent.com/raikonenfnu/dfb3b274007df8c4be87daf9ee67a5f4/raw/e48cc07e5fa558cd2c450b0e3ae46568136e1be6/type_propagate_repro.mlir
iree-opt --pass-pipeline='builtin.module(func.func(iree-codegen-type-propagation))' propagate_test.mlir -o /dev/null

error: failed to legalize unresolved materialization from ('i8') to ('i1') that remained live after conversion
^bb0(%in: i1, %in_0: f32, %in_1: f32, %out: f32):
^
propagate_test.mlir:5:8: note: see current operation: %10 = "builtin.unrealized_conversion_cast"(%arg0) : (i8) -> i1
propagate_test.mlir:6:11: note: see existing live user here: %10 = arith.select %9, %in_0, %in_1 : f32
```

Additionally, to resolve a deprecated-API error in Bazel, we made the following API changes in 6ed8924:

1. `applyPatternsAndFoldGreedily` -> `applyPatternsGreedily`
2. `applyOpPatternsAndFold` -> `applyOpPatternsGreedily`

This PR also carries the following revert:

llvm/llvm-project#119461

The main issue with PR 119461 is that it breaks an e2e RISC-V test by making it get stuck in an infinite loop:

```
/path/to/iree-build/tools/iree-compile \
  --output-format=vm-bytecode \
  --mlir-print-op-on-diagnostic=false \
  --iree-hal-target-backends=llvm-cpu \
  --iree-input-type=stablehlo \
  --iree-input-demote-f64-to-f32 \
  --iree-llvmcpu-target-cpu=generic \
  /path/to/iree/tests/e2e/stablehlo_ops/three_fry.mlir \
  -o three_fly_exec_target.mlir \
  --iree-llvmcpu-target-triple=riscv64 \
  --iree-llvmcpu-target-abi=lp64d \
  --iree-llvmcpu-target-cpu-features=+m,+a,+d,+zvl512b,+v \
  --mlir-disable-threading
> infinite loop
```

---------

Signed-off-by: Stanley Winata <stanley.winata@amd.com>


michaelmaitland added the backend:RISC-V label Dec 10, 2024

michaelmaitland requested review from asb, preames, lukel97, topperc and wangpc-pp December 10, 2024 22:07

michaelmaitland force-pushed the enable-vlopt branch from 2768ab7 to ad48d4d Compare December 10, 2024 22:09

michaelmaitland added 2 commits December 17, 2024 08:01

fixup! update tests after rebase


7ffa9e5

michaelmaitland force-pushed the enable-vlopt branch from ad48d4d to 7ffa9e5 Compare December 17, 2024 16:02

preames approved these changes Dec 17, 2024


michaelmaitland merged commit 169c32e into llvm:main Dec 17, 2024
4 of 7 checks passed

michaelmaitland deleted the enable-vlopt branch December 17, 2024 21:19

lukel97 reviewed Dec 18, 2024


raikonenfnu added a commit to iree-org/llvm-project that referenced this pull request Dec 26, 2024

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

13ae7e4

…19461)" This reverts commit 169c32e.

raikonenfnu mentioned this pull request Dec 26, 2024

Update LLVM to llvm/llvm-project@b13592219c421820b iree-org/iree#19554

Merged

raikonenfnu added a commit to iree-org/llvm-project that referenced this pull request Dec 26, 2024

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

6efaa90

…19461)" This reverts commit 169c32e.

raikonenfnu mentioned this pull request Dec 26, 2024

Update LLVM to llvm/llvm-project@cea738bc iree-org/iree#19561

Closed

raikonenfnu mentioned this pull request Dec 27, 2024

Update LLVM to llvm/llvm-project@ac8bb735 iree-org/iree#19566

Merged

raikonenfnu added a commit to iree-org/llvm-project that referenced this pull request Dec 27, 2024

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

38b4542

…19461)" This reverts commit 169c32e.

Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Dec 27, 2024

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

82c3727

…19461)" This reverts commit 169c32e.

Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Dec 27, 2024

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

79ea81d

…19461)" This reverts commit 169c32e.

IanWood1 pushed a commit to iree-org/llvm-project that referenced this pull request Jan 2, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

f05f42e

…19461)" This reverts commit 169c32e.

IanWood1 mentioned this pull request Jan 2, 2025

Update LLVM to llvm/llvm-project@bca92b1 iree-org/iree#19585

Closed

MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Jan 2, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

b612008

…19461)" This reverts commit 169c32e.

MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Jan 3, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

b5f3be2

…19461)" This reverts commit 169c32e.

MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Jan 7, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

103a73f

…19461)" This reverts commit 169c32e.

MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Jan 13, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

3785531

…19461)" This reverts commit 169c32e.

nirvedhmeshram pushed a commit to iree-org/llvm-project that referenced this pull request Jan 20, 2025

Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (llvm#1…

0f42dbd

…19461)" This reverts commit 169c32e.

MaheshRavishankar mentioned this pull request Jan 22, 2025

Generation of Object file gets stuck in infinite loop after enabling RISCVVLOptimizer #123862

Closed

MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Jan 22, 2025

Revert "Revert "[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (…

40367f2

…llvm#119461)"" This reverts commit 0f42dbd.


michaelmaitland commented Dec 10, 2024

llvmbot commented Dec 10, 2024

michaelmaitland commented Dec 17, 2024

lukel97 commented Dec 17, 2024

michaelmaitland commented Dec 17, 2024

preames commented Dec 17, 2024

lukel97 commented Dec 17, 2024

michaelmaitland commented Dec 17, 2024

topperc commented Dec 17, 2024

preames commented Dec 17, 2024

michaelmaitland commented Dec 17, 2024

topperc commented Dec 17, 2024

michaelmaitland commented Dec 17, 2024

preames left a comment

preames Dec 17, 2024

llvm-ci commented Dec 17, 2024

lukel97 Dec 18, 2024

michaelmaitland Dec 18, 2024

		; CHECK-NEXT: vmv.v.i v9, 3
		; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, ma
