Cranelift: remove block params on critical-edge blocks.#10485
Merged
fitzgen merged 1 commit intobytecodealliance:mainfrom Mar 31, 2025
Merged
Cranelift: remove block params on critical-edge blocks.#10485fitzgen merged 1 commit intobytecodealliance:mainfrom
fitzgen merged 1 commit intobytecodealliance:mainfrom
Conversation
When a block has a terminator branch that targets two or more other blocks at the CLIF level, and any of these blocks have two or more precessors, the edge is a "critical edge" and we split it (insert a new empty block) so that the register allocator has a place to put moves that happen only on that edge. Otherwise, there is no location that works: in the predecessor, code runs no matter which outgoing edge we take; and in the successor, code runs no matter which incoming edge we came from. Currently, when we generate these critical-edge blocks, we insert exactly one instruction: an unconditional branch. We wire up the blockparam dataflow by (i) adding block parameters to the critical-edge block with the same signature as the original target, and (ii) adding all of these arguments to the unconditional branch. In other words, we maintain the original block signature throughout. This is fine and correct, but it has two downsides. The first is a minor loss in compile-time efficiency (more SSA values and block-params to process). The second, more interesting, is that it hinders future work with certain kinds of branches that may define values *on edges*. In particular, this approach prevents exception-handling support: a `try_call` instruction that acts as a terminator branch (with normal-return and exceptional out-edges) defines normal-return values as block-call arguments that are usable on the normal-return edge. Some of these normal-return values may be defined by loads from a return-value area. These loads need somewhere to go; they can't go "after the terminator" (then it wouldn't be a terminator), so they go in an edge block; as a result, the block-call for the normal-return needs to use its arguments only in the unconditional branch out of the edge block, not in the initial branch to the edge block. This PR alters the critical-edge blockparam handling to have no block-call args on the branch into the edge block, and use the original values (not the newly defined edge-block blockparams) in the block-call out of the edge block. This will allow these values to be possibly defined in the edge block rather than in the predecessor (the block with the original terminator). This has no functional change today other than some perturbation of regalloc decisions and a possibly slight compile-time speedup.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When a block has a terminator branch that targets two or more other blocks at the CLIF level, and any of these blocks have two or more precessors, the edge is a "critical edge" and we split it (insert a new empty block) so that the register allocator has a place to put moves that happen only on that edge. Otherwise, there is no location that works: in the predecessor, code runs no matter which outgoing edge we take; and in the successor, code runs no matter which incoming edge we came from.
Currently, when we generate these critical-edge blocks, we insert exactly one instruction: an unconditional branch. We wire up the blockparam dataflow by (i) adding block parameters to the critical-edge block with the same signature as the original target, and (ii) adding all of these arguments to the unconditional branch. In other words, we maintain the original block signature throughout.
This is fine and correct, but it has two downsides. The first is a minor loss in compile-time efficiency (more SSA values and block-params to process). The second, more interesting, is that it hinders future work with certain kinds of branches that may define values on edges.
In particular, this approach prevents exception-handling support: a
try_callinstruction that acts as a terminator branch (with normal-return and exceptional out-edges) defines normal-return values as block-call arguments that are usable on the normal-return edge. Some of these normal-return values may be defined by loads from a return-value area. These loads need somewhere to go; they can't go "after the terminator" (then it wouldn't be a terminator), so they go in an edge block; as a result, the block-call for the normal-return needs to use its arguments only in the unconditional branch out of the edge block, not in the initial branch to the edge block.This PR alters the critical-edge blockparam handling to have no block-call args on the branch into the edge block, and use the original values (not the newly defined edge-block blockparams) in the block-call out of the edge block. This will allow these values to be possibly defined in the edge block rather than in the predecessor (the block with the original terminator).
This has no functional change today other than some perturbation of regalloc decisions and a possibly slight compile-time speedup.