Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold shift+cttz with power of 2 operands #127055

Merged
merged 3 commits into from
Feb 18, 2025

Conversation

MDevereau
Copy link
Contributor

@MDevereau MDevereau commented Feb 13, 2025

#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in #126411. https://alive2.llvm.org/ce/z/gWbtPw

(llvm#121386) Introduced cttz intrinsics which caused a regression
where vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in (llvm#126411)
@llvmbot
Copy link
Member

llvmbot commented Feb 13, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Matthew Devereau (MDevereau)

Changes

#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in #126411.


Full diff: https://github.com/llvm/llvm-project/pull/127055.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp (+16)
  • (modified) llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll (+34)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index 7ef95800975db..ac0f9b005f317 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -1613,6 +1613,22 @@ Instruction *InstCombinerImpl::visitLShr(BinaryOperator &I) {
   if (Instruction *Overflow = foldLShrOverflowBit(I))
     return Overflow;
 
+  // Transform ((pow2 << x) >> cttz(pow2 << y)) -> ((1 << x) >> y)
+  Value *Shl0_Op0, *Shl0_Op1, *Shl1_Op0, *Shl1_Op1;
+  BinaryOperator *Shl1;
+  if (match(Op0, m_Shl(m_Value(Shl0_Op0), m_Value(Shl0_Op1))) &&
+      match(Op1, m_Intrinsic<Intrinsic::cttz>(m_BinOp(Shl1))) &&
+      match(Shl1, m_Shl(m_Value(Shl1_Op0), m_Value(Shl1_Op1))) &&
+      isKnownToBeAPowerOfTwo(Shl1, false, 0, SQ.getWithInstruction(&I).CxtI) &&
+      Shl0_Op0 == Shl1_Op0) {
+    auto *Shl0 = cast<BinaryOperator>(Op0);
+    if ((Shl0->hasNoUnsignedWrap() && Shl1->hasNoUnsignedWrap()) ||
+        (Shl0->hasNoSignedWrap() && Shl1->hasNoSignedWrap())) {
+      Value *NewShl =
+          Builder.CreateShl(ConstantInt::get(Shl1->getType(), 1), Shl0_Op1);
+      return BinaryOperator::CreateLShr(NewShl, Shl1_Op1);
+    }
+  }
   return nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll b/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
index 63caec9501325..6269f29c880e3 100644
--- a/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
+++ b/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
@@ -103,4 +103,38 @@ entry:
   ret i32 %res
 }
 
+define i64 @fold_cttz_64() vscale_range(1,16) {
+; CHECK-LABEL: define i64 @fold_cttz_64(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret i64 4
+;
+entry:
+  %0 = tail call i64 @llvm.vscale.i64()
+  %1 = shl nuw nsw i64 %0, 4
+  %2 = shl nuw nsw i64 %0, 2
+  %3 = tail call range(i64 2, 65) i64 @llvm.cttz.i64(i64 %2, i1 true)
+  %div1 = lshr i64 %1, %3
+  ret i64 %div1
+}
+
+define i32 @fold_cttz_32() vscale_range(1,16) {
+; CHECK-LABEL: define i32 @fold_cttz_32(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret i32 4
+;
+entry:
+  %0 = tail call i32 @llvm.vscale.i32()
+  %1 = shl nuw nsw i32 %0, 4
+  %2 = shl nuw nsw i32 %0, 2
+  %3 = tail call range(i32 2, 65) i32 @llvm.cttz.i32(i32 %2, i1 true)
+  %div1 = lshr i32 %1, %3
+  ret i32 %div1
+}
+
+declare i64 @llvm.vscale.i64()
+declare i64 @llvm.cttz.i64(i64, i1 immarg)
+declare i32 @llvm.vscale.i32()
+declare i32 @llvm.cttz.i32(i32, i1 immarg)
 declare void @use(i32)

Propagate nowrap flags to new shift
Use named values in tests
Remove intrinsic declarations
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@MDevereau MDevereau merged commit 251377c into llvm:main Feb 18, 2025
8 checks passed
@MDevereau MDevereau deleted the svcnt branch February 18, 2025 10:26
davemgreen added a commit that referenced this pull request Feb 18, 2025
See #126411 / #127055, the test isn't expected to fold in a single instcombine
iteration, needing instcombine->cse->instcombine.
wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025
llvm#121386 Introduced cttz intrinsics which caused a regression where
vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in llvm#126411.
https://alive2.llvm.org/ce/z/gWbtPw
wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine
iteration, needing instcombine->cse->instcombine.
YutongZhuu pushed a commit to YutongZhuu/llvm-project that referenced this pull request Mar 8, 2025
llvm#121386 Introduced cttz intrinsics which caused a regression where
vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in llvm#126411.
https://alive2.llvm.org/ce/z/gWbtPw
YutongZhuu pushed a commit to YutongZhuu/llvm-project that referenced this pull request Mar 8, 2025
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine
iteration, needing instcombine->cse->instcombine.
YutongZhuu pushed a commit to YutongZhuu/llvm-project that referenced this pull request Mar 8, 2025
llvm#121386 Introduced cttz intrinsics which caused a regression where
vscale/vscale divisions could no longer be constant folded.

This fold was suggested as a fix in llvm#126411.
https://alive2.llvm.org/ce/z/gWbtPw
YutongZhuu pushed a commit to YutongZhuu/llvm-project that referenced this pull request Mar 8, 2025
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine
iteration, needing instcombine->cse->instcombine.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants