feat: core SPI for contrib leaf scans (CometScanWithPlanData)#3

Draft
schenksj wants to merge 1 commit into main from pr/delta-A1-spi

@schenksj

Owner

Fork-local draft for the /review-comet-pr loop (Delta-contrib PR split, unit A.1). Not the upstream PR — base is schenksj:main, kept current with apache/main. Tracking umbrella: apache#4366.

What this unit is

The first reviewable slice of the Delta-contrib work: a small extension contract that lets out-of-tree Comet contrib leaf scans (Delta now; Hudi/etc. later) take part in native planning without core holding a compile-time reference to them. This is the same shape Iceberg already uses, where the source-specific code stays at the edge.

Changes

  • trait CometScanWithPlanData: sourceKey / commonData / perPartitionData, plus optional dynamicPruningFilters / withDynamicPruningFilters (for scans whose DPP filters live in a @transient field, apache/datafusion-comet#3510). CometNativeScanExec mixes it in.
  • foreachUntilCometInput now matches case _: CometLeafExec — a strict superset of the previous fixed scan list (every built-in leaf scan already extends CometLeafExec).
  • PlanDataInjector.findAllPlanData collects per-partition planning data via the trait instead of a hardcoded CometNativeScanExec match.
  • PlanDataInjector registry gains one reflective DeltaPlanDataInjector$ slot, appended only when a contrib build bundles it (-Pcontrib-delta). Default builds get ClassNotFoundException → None and an unchanged list.
  • CometPlanAdaptiveDynamicPruningFilters rewrites AQE DPP filters in place for trait scans whose filters can't survive makeCopy.
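
To make the shape of the contract concrete, here is a minimal standalone sketch of the trait and of trait-based collection. Only the member names (sourceKey, commonData, perPartitionData, dynamicPruningFilters, withDynamicPruningFilters) come from this description. Every type, every default, the `ExampleContribScan` class, and the `collectPlanData` helper are assumptions for illustration and are not the signatures in the diff.

```scala
// Illustrative sketch only: member names come from the PR description;
// all types, defaults, and helper names below are assumptions.
object PlanDataSketch {
  trait CometScanWithPlanData {
    /** Key identifying which scan the planning data belongs to (type assumed). */
    def sourceKey: String

    /** Planning data shared by all partitions (payload type assumed). */
    def commonData: Array[Byte]

    /** One planning-data payload per partition (payload type assumed). */
    def perPartitionData: Array[Array[Byte]]

    /** Optional hook: scans without transient DPP filters keep this default. */
    def dynamicPruningFilters: Seq[String] = Seq.empty

    /** Optional hook: the default returns the scan unchanged. */
    def withDynamicPruningFilters(filters: Seq[String]): CometScanWithPlanData = this
  }

  // Hypothetical contrib scan that implements only the required members.
  final class ExampleContribScan(parts: Int) extends CometScanWithPlanData {
    val sourceKey: String = "example-scan-1"
    val commonData: Array[Byte] = Array[Byte](1, 2, 3)
    val perPartitionData: Array[Array[Byte]] =
      Array.tabulate(parts)(i => Array(i.toByte))
  }

  // Collect planning data by matching the trait rather than a concrete class,
  // in the spirit of the findAllPlanData change (helper name is hypothetical).
  def collectPlanData(
      nodes: Seq[Any]): Map[String, (Array[Byte], Array[Array[Byte]])] =
    nodes.collect { case s: CometScanWithPlanData =>
      (s.sourceKey, (s.commonData, s.perPartitionData))
    }.toMap
}
```

Because the collector matches on the trait, a scan type defined outside core is picked up without core naming its class, which is the property the list above relies on.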

What it deliberately does NOT do yet

Why it's safe / inert

With no contrib on the classpath: the leaf match is a superset of the old enumeration, the trait match catches the same CometNativeScanExec, and the reflective slot resolves to nothing. Behavior-preserving on default builds.
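
The "resolves to nothing" behavior of the reflective slot can be sketched as follows. This is a hedged illustration of the general technique (loading a Scala object by its `$`-suffixed class name and mapping a missing class to `None`), not the code in this PR. The `Injector` trait, `BuiltInInjector`, `loadModule`, `injectors`, and the class name used in the usage note are all hypothetical, and reading the `MODULE$` field is an assumption about how such a lookup might be done.

```scala
// Illustrative sketch only: all names here are hypothetical stand-ins.
object ReflectiveSlotSketch {
  trait Injector { def name: String }

  object BuiltInInjector extends Injector { val name = "built-in" }

  /**
   * Try to load a Scala `object` by its JVM class name (note the trailing `$`).
   * A class that is absent from the classpath yields None instead of failing.
   */
  def loadModule(className: String): Option[Injector] =
    try {
      val cls = Class.forName(className)
      // Top-level Scala objects expose their singleton via a static MODULE$ field.
      cls.getField("MODULE$").get(null) match {
        case i: Injector => Some(i)
        case _           => None
      }
    } catch {
      case _: ClassNotFoundException | _: NoSuchFieldException => None
    }

  /** Built-in injectors, plus the optional slot only when it resolves. */
  def injectors(optionalClassName: String): Seq[Injector] =
    Seq[Injector](BuiltInInjector) ++ loadModule(optionalClassName).toSeq
}
```

With a class name that is not on the classpath, for example `ReflectiveSlotSketch.injectors("org.example.absent.MissingInjector$")`, the returned list contains only the built-in entry, which mirrors the unchanged-list claim above.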

Verification

  • CometScanWithPlanDataSuite (new): trait-contract defaults + reflective-slot graceful absence — 2/2.
  • CometJoinSuite (native scan fusion + DPP path): 28/28.
  • spotless + scalastyle: clean.
  • No native changes in this unit.

@schenksj schenksj changed the title [Delta split A.1] Core SPI for contrib leaf scans (CometScanWithPlanData) feat: core SPI for contrib leaf scans (CometScanWithPlanData) Jun 13, 2026
Introduce a small extension contract so out-of-tree Comet contrib leaf scans
(Delta, and future Hudi/etc.) can participate in native planning without core
holding a compile-time reference to them -- mirroring the Iceberg precedent of
keeping the data-source-specific code at the edge.

What this adds:
- `trait CometScanWithPlanData` (`sourceKey` / `commonData` / `perPartitionData`,
  plus optional `dynamicPruningFilters` / `withDynamicPruningFilters` for scans
  whose DPP filters live in a @transient field). `CometNativeScanExec` now mixes
  it in.
- `CometNativeExec.foreachUntilCometInput` matches `case _: CometLeafExec` (a
  strict superset of the previous fixed scan enumeration -- all built-in leaf
  scans already extend `CometLeafExec`), so any leaf Comet exec is recognised as
  an input boundary.
- `PlanDataInjector.findAllPlanData` collects per-partition planning data via the
  trait instead of a hardcoded `CometNativeScanExec` match.
- `PlanDataInjector`'s registry gains one reflective `DeltaPlanDataInjector$`
  slot, appended only when a contrib bundled it (`-Pcontrib-delta`). Default
  builds get a `ClassNotFoundException` -> `None` and an unchanged injectors list,
  so there is zero contrib surface at runtime.
- `CometPlanAdaptiveDynamicPruningFilters` rewrites AQE DPP filters in place for
  trait scans whose filters can't survive `makeCopy` (apache#3510).

Inert by construction: with no contrib on the classpath this is behavior-
preserving (the leaf match is a superset; the trait match catches the same
`CometNativeScanExec`; the reflective slot resolves to nothing).

Tests: `CometScanWithPlanDataSuite` (trait-contract defaults + reflective-slot
graceful absence). Verified `CometJoinSuite` (native scan fusion / DPP) stays
green.

First unit of the Delta-contrib PR split (tracking: apache#4366).
schenksj added a commit that referenced this pull request Jun 13, 2026