
[query] no-sharing in MatrixIR lowering#15513

Open
ehigham wants to merge 1 commit into hail-is:main from ehigham:ehigham/no-sharing-matrix-ir

Member

@ehigham ehigham commented May 28, 2026

Partially resolves #13250
Succeeds #15520

This change is one in a series that aims to preserve the TreeIR invariant throughout the various lowerings, culminating in the removal of noSharing.

In this change, MatrixIR lowerings generate a name-normalised tree-ir.
This is enforced via the before and after invariants in LowerMatrixToTablePass.
The major changes include:

  • LowerMatrixIR:
    • Shadows names in each applicable binding scope instead of subst with a particular BindingEnv
  • DeprecatedIRBuilder:
    • Uses a full BindingEnv, as shadowing names in different binding scopes requires separate environments
  • MatrixWriter
    • Partial re-writes of lowerings to use Memoized to reduce indent.
    • Simplify lowerings through
      • removing array intermediates (MatrixBlockMatrixWriter)
      • removing unnecessary cda (MatrixBGENWriter)
  • Naming:
    • Weakly adopt the convention that compiler-generated names are prefixed by __
    • Prefer hard-coded names over freshName, as it helps with Pretty output (especially with preserveNames = true)
  • General
    • prefer makestruct(..) over MakeStruct(FastSeq(..))
    • use ir helper functions where appropriate
    • re-use bindings where possible
    • lift and bind loop-invariant expressions
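
As a rough illustration of the last two bullets, here is a minimal sketch on a toy expression type. It is not Hail's IR: `Expr`, `bind`, `SumRange` and the name `__k_sq` are all hypothetical and exist only for this example. It shows a loop-invariant subexpression being bound once under a hard-coded, `__`-prefixed name and then referenced by name inside the loop, rather than being rebuilt on every iteration.

```scala
object LiftSketch {
  // Toy expression tree (an assumption for illustration, not Hail's IR).
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Ref(name: String) extends Expr
  final case class Add(l: Expr, r: Expr) extends Expr
  final case class Square(x: Expr) extends Expr // stands in for an expensive op
  final case class Let(name: String, value: Expr, body: Expr) extends Expr
  final case class SumRange(n: Int, elt: String, body: Expr) extends Expr

  // Hypothetical helper: bind `value` once and hand the body a reference to it.
  def bind(name: String, value: Expr)(body: Ref => Expr): Expr =
    Let(name, value, body(Ref(name)))

  // Evaluates `e`, returning (result, number of Square evaluations).
  def eval(e: Expr, env: Map[String, Int] = Map.empty): (Int, Int) = e match {
    case Lit(v) => (v, 0)
    case Ref(n) => (env(n), 0)
    case Add(l, r) =>
      val (a, c1) = eval(l, env)
      val (b, c2) = eval(r, env)
      (a + b, c1 + c2)
    case Square(x) =>
      val (a, c) = eval(x, env)
      (a * a, c + 1)
    case Let(n, v, b) =>
      val (a, c1) = eval(v, env)
      val (r, c2) = eval(b, env + (n -> a))
      (r, c1 + c2)
    case SumRange(n, elt, body) =>
      (0 until n).foldLeft((0, 0)) { case ((acc, cnt), i) =>
        val (v, c) = eval(body, env + (elt -> i))
        (acc + v, cnt + c)
      }
  }

  // Square(k) does not depend on the loop variable `i` ...
  val naive: Expr = SumRange(3, "i", Add(Square(Ref("k")), Ref("i")))

  // ... so it can be lifted out of the loop and bound once.
  val lifted: Expr =
    bind("__k_sq", Square(Ref("k")))(kSq => SumRange(3, "i", Add(kSq, Ref("i"))))
}
```

With `k = 4`, both trees evaluate to the same sum, but `naive` evaluates `Square` once per iteration while `lifted` evaluates it once in total. A fixed name such as `__k_sq` also stays recognisable in printed output, which is the motivation given above for preferring hard-coded names.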

Comment thread hail/hail/src/is/hail/expr/ir/LowerMatrixIR.scala Outdated
Comment thread hail/hail/src/is/hail/expr/ir/MatrixWriter.scala Outdated
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch from d43d087 to b98c707 Compare May 28, 2026 20:20
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch 2 times, most recently from 2817262 to c7ad1e1 Compare May 28, 2026 20:48
@ehigham ehigham changed the title [query] No-Sharing in MatrixIR lowering [query] no-sharing in MatrixIR lowering May 29, 2026
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch from c7ad1e1 to 16a34ad Compare June 2, 2026 02:25
Comment thread hail/hail/src/is/hail/expr/ir/LowerMatrixIR.scala
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch 11 times, most recently from 69d3574 to a400d2c Compare June 4, 2026 16:55
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch 4 times, most recently from 1bc0440 to 7dd6844 Compare June 5, 2026 15:27
  maxLen: Int = -1,
  allowUnboundRefs: Boolean = false,
- preserveNames: Boolean = false,
+ preserveNames: Boolean = true,
Member Author

Calling this out - preserveNames = true is a much more useful default for me.
For lowering invariants, you want to see where a name is bound throughout a stacktrace. After extract, you want to identify the subexpression that was lifted, etc.

@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch 5 times, most recently from b84c6ad to efff260 Compare June 5, 2026 17:17
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch 3 times, most recently from 457da83 to 73f77ad Compare June 5, 2026 20:48
@ehigham ehigham removed the stacked PR label Jun 5, 2026
@ehigham ehigham force-pushed the ehigham/no-sharing-matrix-ir branch from 73f77ad to deb6043 Compare June 5, 2026 21:16
@ehigham ehigham requested a review from Copilot June 5, 2026 21:23
Contributor

Copilot AI left a comment

Pull request overview

This PR updates MatrixIR lowering and related IR-building utilities to preserve the TreeIR “no-sharing / name-normalized” invariant throughout matrix-to-table lowering, as part of the broader effort to reduce repeated NormalizeNames invocations and eventually remove noSharing.
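
For readers unfamiliar with the term, a minimal sketch of what a "no-sharing" check could look like follows. It assumes that "no sharing" means no node object is reachable along more than one path from the root, so the IR is a tree rather than a DAG. `Node`, `Leaf`, `Branch` and `noSharing` are hypothetical names for this illustration and are not taken from the Hail code base.

```scala
import java.util.{Collections, IdentityHashMap}

object NoSharingSketch {
  // Toy tree type (an assumption for illustration, not Hail's IR).
  sealed trait Node { def children: Seq[Node] }
  final case class Leaf(label: String) extends Node {
    def children: Seq[Node] = Seq.empty
  }
  final case class Branch(left: Node, right: Node) extends Node {
    def children: Seq[Node] = Seq(left, right)
  }

  // True when every node object is visited exactly once.
  // Nodes are compared by reference identity, not structural equality,
  // so two equal-looking but distinct copies do not count as sharing.
  def noSharing(root: Node): Boolean = {
    val seen = Collections.newSetFromMap(new IdentityHashMap[Node, java.lang.Boolean]())
    def visit(n: Node): Boolean = seen.add(n) && n.children.forall(visit)
    visit(root)
  }

  val x: Node = Leaf("x")
  val shared: Node = Branch(x, x)                // same object used twice
  val copied: Node = Branch(Leaf("x"), Leaf("x")) // two distinct objects
}
```

Here `shared` fails the check because the same `Leaf` object appears twice, while `copied` passes even though its two leaves are structurally equal. This is the kind of situation the deep copies mentioned in the per-file summary avoid.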

Changes:

  • Refactors LowerMatrixIR to avoid subst-based rewriting and instead introduce bindings via scoped shadowing (with a final NormalizeNames on the result).
  • Updates DeprecatedIRBuilder to use a full BindingEnv (eval/agg/scan) so that shadowing works correctly across binding scopes.
  • Simplifies/rewrites multiple MatrixWriter lowerings (e.g., using Memoized, removing intermediates), plus small cleanups across tests and pretty-printing defaults.
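
The point about needing separate environments per binding scope can be sketched as follows. This is a toy model only: `Scopes` and its methods are hypothetical, types are represented as plain strings, and only the eval and agg scopes are modelled. It is not the `BindingEnv` API.

```scala
object ScopedEnvSketch {
  // Toy scoped environment: one name-to-type map per binding scope.
  final case class Scopes(
    eval: Map[String, String] = Map.empty[String, String],
    agg: Option[Map[String, String]] = None
  ) {
    def bindEval(name: String, typ: String): Scopes =
      copy(eval = eval + (name -> typ))

    def bindAgg(name: String, typ: String): Scopes =
      copy(agg = Some(agg.getOrElse(Map.empty[String, String]) + (name -> typ)))

    def lookupEval(name: String): Option[String] = eval.get(name)
    def lookupAgg(name: String): Option[String] = agg.flatMap(_.get(name))
  }

  // With a single flat map, shadowing `__x` in one scope overwrites
  // the binding that the other scope still needs.
  val flat: Map[String, String] = Map("__x" -> "int32") + ("__x" -> "float64")

  // With one map per scope, the same name can mean different things
  // in the eval and agg scopes without either clobbering the other.
  val scoped: Scopes = Scopes().bindEval("__x", "int32").bindAgg("__x", "float64")
}
```

In the flat map the `int32` binding is lost, whereas `scoped` still resolves `__x` to `int32` in the eval scope and `float64` in the agg scope.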

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| hail/hail/test/src/is/hail/HailSuite.scala | Deep-copies BlockMatrixIR before running multiple exec strategies to avoid accidental sharing/mutation. |
| hail/hail/test/src/is/hail/expr/ir/TableIRSuite.scala | Updates MatrixNativeReader usage to the new spec accessor. |
| hail/hail/test/src/is/hail/expr/ir/table/TableGenSuite.scala | Simplifies lowering tests to evaluate collected IR directly; adopts newer IR helpers (makestruct, MakeStream.single). |
| hail/hail/test/src/is/hail/expr/ir/MatrixIRSuite.scala | Uses Atom.ir deep copies in test IR construction to avoid sharing. |
| hail/hail/test/src/is/hail/expr/ir/Aggregators2Suite.scala | Refactors aggregator IR construction using higher-level helpers (insertIR, aggBindIR). |
| hail/hail/src/is/hail/expr/ir/Pretty.scala | Changes Pretty default to preserveNames = true; adjusts SSA arg naming for MatrixMapRows. |
| hail/hail/src/is/hail/expr/ir/Optimize.scala | Relies on new Pretty defaults (drops explicit preserveNames = true). |
| hail/hail/src/is/hail/expr/ir/MatrixWriter.scala | Updates writer components to accept Atom bindings; rewrites several writer lowerings with Memoized and helper combinators. |
| hail/hail/src/is/hail/expr/ir/MatrixIR.scala | Introduces MakeArray.empty; removes getSpec() in favor of spec; refactors native reader col-reading IR. |
| hail/hail/src/is/hail/expr/ir/LowerMatrixIR.scala | Major refactor to shadow names per-scope and enforce name-normalized output via NormalizeNames. |
| hail/hail/src/is/hail/expr/ir/lowering/LowerTableIR.scala | Updates TableStage APIs to accept Atom and reduces repeated evaluation with bindIR. |
| hail/hail/src/is/hail/expr/ir/lowering/LoweringPass.scala | Simplifies LowerMatrixToTablePass to call the unified LowerMatrixIR entrypoint; tightens before invariant. |
| hail/hail/src/is/hail/expr/ir/lowering/LowerBlockMatrixIR.scala | Minor cleanup using MakeStream.single. |
| hail/hail/src/is/hail/expr/ir/lowering/invariant/package.scala | Uses new Pretty defaults for invariant failure traces. |
| hail/hail/src/is/hail/expr/ir/IR.scala | Makes Let.apply elide empty blocks; adds MakeArray.empty(elementType). |
| hail/hail/src/is/hail/expr/ir/ForwardLets.scala | Uses new Pretty defaults for logging. |
| hail/hail/src/is/hail/expr/ir/DeprecatedIRBuilder.scala | Switches to BindingEnv[Type] and updates agg/scan/eval environment handling for shadowing correctness. |
| hail/hail/src/is/hail/expr/ir/agg/Extract.scala | Fixes init-op arg binding behavior when knownLength is present. |


Comment on lines +690 to 693
case MatrixColsHead(child, n) =>
lower(ctx, child, liftedRelationalLets)
.mapGlobals('global.insertFields('__cols -> 'global('__cols).arraySlice(0, Some(n), 1)))
.mapRows('row.insertFields(entriesField -> 'row(entriesField).arraySlice(0, Some(n), 1)))
Comment on lines +695 to 698
case MatrixColsTail(child, n) =>
lower(ctx, child, liftedRelationalLets)
.mapGlobals('global.insertFields('__cols -> 'global('__cols).arraySlice(-n, None, 1)))
.mapRows('row.insertFields(entriesField -> 'row(entriesField).arraySlice(-n, None, 1)))

Development

Successfully merging this pull request may close these issues.

NormalizeNames at most once

2 participants