-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[InstCombine] Fold shift+cttz with power of 2 operands #127055
Conversation
(llvm#121386) Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded. This fold was suggested as a fix in (llvm#126411)
@llvm/pr-subscribers-llvm-transforms Author: Matthew Devereau (MDevereau) Changes#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded. This fold was suggested as a fix in #126411. Full diff: https://github.com/llvm/llvm-project/pull/127055.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index 7ef95800975db..ac0f9b005f317 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -1613,6 +1613,22 @@ Instruction *InstCombinerImpl::visitLShr(BinaryOperator &I) {
if (Instruction *Overflow = foldLShrOverflowBit(I))
return Overflow;
+ // Transform ((pow2 << x) >> cttz(pow2 << y)) -> ((1 << x) >> y)
+ Value *Shl0_Op0, *Shl0_Op1, *Shl1_Op0, *Shl1_Op1;
+ BinaryOperator *Shl1;
+ if (match(Op0, m_Shl(m_Value(Shl0_Op0), m_Value(Shl0_Op1))) &&
+ match(Op1, m_Intrinsic<Intrinsic::cttz>(m_BinOp(Shl1))) &&
+ match(Shl1, m_Shl(m_Value(Shl1_Op0), m_Value(Shl1_Op1))) &&
+ isKnownToBeAPowerOfTwo(Shl1, false, 0, SQ.getWithInstruction(&I).CxtI) &&
+ Shl0_Op0 == Shl1_Op0) {
+ auto *Shl0 = cast<BinaryOperator>(Op0);
+ if ((Shl0->hasNoUnsignedWrap() && Shl1->hasNoUnsignedWrap()) ||
+ (Shl0->hasNoSignedWrap() && Shl1->hasNoSignedWrap())) {
+ Value *NewShl =
+ Builder.CreateShl(ConstantInt::get(Shl1->getType(), 1), Shl0_Op1);
+ return BinaryOperator::CreateLShr(NewShl, Shl1_Op1);
+ }
+ }
return nullptr;
}
diff --git a/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll b/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
index 63caec9501325..6269f29c880e3 100644
--- a/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
+++ b/llvm/test/Transforms/InstCombine/shift-cttz-ctlz.ll
@@ -103,4 +103,38 @@ entry:
ret i32 %res
}
+define i64 @fold_cttz_64() vscale_range(1,16) {
+; CHECK-LABEL: define i64 @fold_cttz_64(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: ret i64 4
+;
+entry:
+ %0 = tail call i64 @llvm.vscale.i64()
+ %1 = shl nuw nsw i64 %0, 4
+ %2 = shl nuw nsw i64 %0, 2
+ %3 = tail call range(i64 2, 65) i64 @llvm.cttz.i64(i64 %2, i1 true)
+ %div1 = lshr i64 %1, %3
+ ret i64 %div1
+}
+
+define i32 @fold_cttz_32() vscale_range(1,16) {
+; CHECK-LABEL: define i32 @fold_cttz_32(
+; CHECK-SAME: ) #[[ATTR0]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: ret i32 4
+;
+entry:
+ %0 = tail call i32 @llvm.vscale.i32()
+ %1 = shl nuw nsw i32 %0, 4
+ %2 = shl nuw nsw i32 %0, 2
+ %3 = tail call range(i32 2, 65) i32 @llvm.cttz.i32(i32 %2, i1 true)
+ %div1 = lshr i32 %1, %3
+ ret i32 %div1
+}
+
+declare i64 @llvm.vscale.i64()
+declare i64 @llvm.cttz.i64(i64, i1 immarg)
+declare i32 @llvm.vscale.i32()
+declare i32 @llvm.cttz.i32(i32, i1 immarg)
declare void @use(i32)
|
Propagate nowrap flags to new shift Use named values in tests Remove intrinsic declarations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
llvm#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded. This fold was suggested as a fix in llvm#126411. https://alive2.llvm.org/ce/z/gWbtPw
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine iteration, needing instcombine->cse->instcombine.
llvm#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded. This fold was suggested as a fix in llvm#126411. https://alive2.llvm.org/ce/z/gWbtPw
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine iteration, needing instcombine->cse->instcombine.
llvm#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded. This fold was suggested as a fix in llvm#126411. https://alive2.llvm.org/ce/z/gWbtPw
See llvm#126411 / llvm#127055, the test isn't expected to fold in a single instcombine iteration, needing instcombine->cse->instcombine.
#121386 Introduced cttz intrinsics which caused a regression where vscale/vscale divisions could no longer be constant folded.
This fold was suggested as a fix in #126411. https://alive2.llvm.org/ce/z/gWbtPw