Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ssa): Basic control dependent LICM #7660

Draft
wants to merge 30 commits into
base: mv/post-dominator-tree
Choose a base branch
from

Conversation

vezenovm
Copy link
Contributor

@vezenovm vezenovm commented Mar 11, 2025

Description

Problem*

Resolves #7569

This PR resolves the optimizations described in the issue. However, the optimization can be expanded by further accounting for predicates and possibly even hoisting instructions under predicates as part of loop peeling/splitting. I leave these tasks to be captured in separate issues.

Summary*

We have certain instructions that we mark as unsafe for hoisting when they are only dependent upon a predicate and do not have any other side effects (e.g. a checked arithmetic operation). You can look into the linked issue for an example of logic that will not be hoisted.

If a predicated instruction shares the same predicate as the entire loop we can now hoist this instruction. The new unit tests under loop_invariant.rs provide examples of when a predicated instruction can now be hoisted.

We check for whether an instruction shares a predicate with the entire loop through control dependence analysis. If the block containing an instruction is control dependent on any blocks between the pre-header and the instruction's block we know that there has been some branching. We then mark the current block as being control dependent.

If a block is not control dependent, we mark instructions that are reliant upon predicates (but do not have other side effects) as safe for hoisting.

We determine whether a block is control dependent on a parent block by following Definition 3 in Ferrante et al., 1987, The program dependence graph and its use in optimization.

Definition 3. Let G be a control flow graph. Let X and Y be nodes in G. Y is
control dependent on X iff
(1) there exists a directed path P from X to Y with any 2 in P (excluding X
and Y) post-dominated by Y and
(2) X is not post-dominated by Y. 

Builds upon #7595 to utilize the post-dominator tree. We can further optimize our control dependence check by looking at the post-dominator frontiers (which we will refer to as reverse dominance frontiers).

As per, Morgan, B. (97) Building an Optimizing Compiler, the dominance frontier DF(B) of a block B is the set of all blocks C such that B dominates a predecessor of C but either B equals C or B does not dominate C.

Switching the above statement from dominance to post-dominance, we can see that the conditions for control dependence are the same as that of post-dominance frontiers. In fact, we can rewrite our control dependence definition from above to the following: Y is control dependent on X iff Y is in PDF(X).

If we were to check the control dependence as defined in Ferrante et al., 1987, The program dependence graph and its use in optimization, this would require n^2 in the worst case for checking whether a given loop block is control dependent on any block between it and the loop pre header. Generating post-dominance frontiers allows us to move to a linear worst case control dependency check.

Additional Context

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: fbebe85 Previous: 6eaea18 Ratio
noir-lang_noir_bigcurve_ 261 s 199 s 1.31

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

Copy link
Contributor

github-actions bot commented Mar 11, 2025

Changes to Brillig bytecode sizes

Generated at commit: 4fcc14219723e479b9f481ee2cd4ecc4c651ba65, compared to commit: e70f71959539e3d5f39fd53f292304cacddde276

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
poseidon_bn254_hash_width_3_inliner_min -12 ✅ -0.24%
poseidon_bn254_hash_inliner_min -12 ✅ -0.24%
loop_invariant_regression_inliner_max -16 ✅ -10.67%
loop_invariant_regression_inliner_zero -16 ✅ -10.67%

Full diff report 👇
Program Brillig opcodes (+/-) %
poseidon_bn254_hash_width_3_inliner_max 5,242 (-2) -0.04%
poseidon_bn254_hash_inliner_max 5,242 (-2) -0.04%
poseidon_bn254_hash_width_3_inliner_zero 4,656 (-2) -0.04%
poseidon_bn254_hash_inliner_zero 4,656 (-2) -0.04%
regression_5252_inliner_max 4,434 (-2) -0.05%
poseidonsponge_x5_254_inliner_max 4,098 (-2) -0.05%
regression_5252_inliner_min 3,555 (-6) -0.17%
regression_5252_inliner_zero 3,390 (-6) -0.18%
poseidonsponge_x5_254_inliner_min 3,159 (-6) -0.19%
poseidonsponge_x5_254_inliner_zero 3,019 (-6) -0.20%
poseidon_bn254_hash_width_3_inliner_min 5,023 (-12) -0.24%
poseidon_bn254_hash_inliner_min 5,023 (-12) -0.24%
loop_invariant_regression_inliner_max 134 (-16) -10.67%
loop_invariant_regression_inliner_zero 134 (-16) -10.67%

Copy link
Contributor

github-actions bot commented Mar 11, 2025

Changes to number of Brillig opcodes executed

Generated at commit: 4fcc14219723e479b9f481ee2cd4ecc4c651ba65, compared to commit: e70f71959539e3d5f39fd53f292304cacddde276

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
loop_invariant_regression_inliner_min -816 ✅ -46.44%
loop_invariant_regression_inliner_max -689 ✅ -64.51%
loop_invariant_regression_inliner_zero -689 ✅ -64.51%

Full diff report 👇
Program Brillig opcodes (+/-) %
brillig_cow_regression_inliner_min 220,181 (-180) -0.08%
brillig_cow_regression_inliner_max 216,973 (-180) -0.08%
brillig_cow_regression_inliner_zero 216,973 (-180) -0.08%
poseidon_bn254_hash_inliner_zero 181,451 (-1,509) -0.82%
poseidon_bn254_hash_width_3_inliner_zero 181,451 (-1,509) -0.82%
poseidon_bn254_hash_inliner_max 150,934 (-1,509) -0.99%
poseidon_bn254_hash_width_3_inliner_max 150,934 (-1,509) -0.99%
poseidonsponge_x5_254_inliner_max 170,888 (-2,160) -1.25%
regression_5252_inliner_max 851,025 (-10,800) -1.25%
poseidon_bn254_hash_inliner_min 185,155 (-3,629) -1.92%
poseidon_bn254_hash_width_3_inliner_min 185,155 (-3,629) -1.92%
poseidonsponge_x5_254_inliner_min 205,657 (-5,098) -2.42%
regression_5252_inliner_min 1,027,782 (-25,490) -2.42%
poseidonsponge_x5_254_inliner_zero 172,671 (-34,008) -16.45%
regression_5252_inliner_zero 860,337 (-170,040) -16.50%
loop_invariant_regression_inliner_min 941 (-816) -46.44%
loop_invariant_regression_inliner_max 379 (-689) -64.51%
loop_invariant_regression_inliner_zero 379 (-689) -64.51%

… to avoid repeat work, allow control dependence checks on acir again
@vezenovm vezenovm added the bench-show Display benchmark results on PR label Mar 11, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Time

Benchmark suite Current: 096dcb3 Previous: 9983007 Ratio
regression_4709 0.698 s 0.715 s 0.98
ram_blowup_regression 14.8 s 15 s 0.99
global_var_regression_entry_points 0.48 s 0.497 s 0.97
private-kernel-inner 2.312 s 2.314 s 1.00
private-kernel-reset 7.126 s 6.86 s 1.04
private-kernel-tail 1.2 s 1.206 s 1.00
rollup-base-private 15.72 s 14.8 s 1.06
rollup-base-public 11.08 s 10.9 s 1.02
rollup-block-root-empty 0.918 s 0.933 s 0.98
rollup-block-root-single-tx 120 s 125 s 0.96
rollup-block-root 123 s 128 s 0.96
rollup-merge 0.91 s 0.9 s 1.01
rollup-root 1.486 s 1.498 s 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Time

Benchmark suite Current: 096dcb3 Previous: 9983007 Ratio
private-kernel-inner 0.069 s 0.068 s 1.01
private-kernel-reset 0.284 s 0.285 s 1.00
private-kernel-tail 0.026 s 0.027 s 0.96
rollup-base-private 0.734 s 0.728 s 1.01
rollup-base-public 0.484 s 0.482 s 1.00
rollup-block-root 17 s 16.9 s 1.01
rollup-merge 0.006 s 0.006 s 1
rollup-root 0.024 s 0.024 s 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Compilation Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 60a01ba Previous: 4c1c76e Ratio
regression_4709 0.893 s 0.693 s 1.29

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test Suite Duration

Benchmark suite Current: 096dcb3 Previous: 9983007 Ratio
AztecProtocol_aztec-packages_noir-projects_aztec-nr 41 s 41 s 1
AztecProtocol_aztec-packages_noir-projects_noir-contracts 75 s 76 s 0.99
AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob 48 s 42 s 1.14
AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_private-kernel-lib 170 s 167 s 1.02
AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_reset-kernel-lib 10 s 10 s 1
AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib 166 s 157 s 1.06
AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_types 54 s 54 s 1
noir-lang_noir-bignum_ 79 s 85 s 0.93
noir-lang_noir_bigcurve_ 209 s 209 s 1
noir-lang_noir_json_parser_ 9 s 8 s 1.13
noir-lang_sha512_ 23 s 25 s 0.92

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Memory

Benchmark suite Current: 096dcb3 Previous: 9983007 Ratio
private-kernel-inner 299.97 MB 300.04 MB 1.00
private-kernel-reset 609.94 MB 609.94 MB 1
private-kernel-tail 226.62 MB 226.64 MB 1.00
rollup-base-private 1250 MB 1250 MB 1
rollup-base-public 1330 MB 1330 MB 1
rollup-block-root-empty 303.45 MB 303.44 MB 1.00
rollup-block-root-single-tx 7860 MB 7860 MB 1
rollup-block-root 7870 MB 7870 MB 1
rollup-merge 301.87 MB 301.88 MB 1.00
rollup-root 349.43 MB 349.4 MB 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Memory

Benchmark suite Current: 096dcb3 Previous: 9983007 Ratio
private-kernel-inner 239.85 MB 239.85 MB 1
private-kernel-reset 274.25 MB 274.25 MB 1
private-kernel-tail 213.48 MB 213.48 MB 1
rollup-base-private 507.63 MB 507.63 MB 1
rollup-base-public 416.42 MB 416.42 MB 1
rollup-block-root 1420 MB 1420 MB 1
rollup-merge 290.45 MB 290.45 MB 1
rollup-root 296.92 MB 296.92 MB 1

This comment was automatically generated by workflow using github-action-benchmark.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bench-show Display benchmark results on PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant