diff --git a/codegen/GENERATION.md b/codegen/GENERATION.md
new file mode 100644
index 00000000..a90f17f0
--- /dev/null
+++ b/codegen/GENERATION.md
@@ -0,0 +1,57 @@
+# JMEOS generation — the canonical per-binding generator policy
+
+JMEOS is a **generated** binding. This document is the contract for how it is generated,
+under the ecosystem-wide per-binding generator policy.
+
+## The policy (ecosystem-wide)
+
+Every MobilityDB language binding is a **pure projection of the MEOS-API catalog**, and
+**each binding owns its own generator, in its own repo**, in a canonical layout — not a
+single central generator-repo. The single source of truth is the **catalog**
+(`MEOS-API/output/meos-idl.json`, generated from the MEOS C headers), not a generator
+location. This mirrors how MEOS itself is built: independent, plug-and-play, CMake-gated
+families — a binding is likewise an independent module that owns its generation.
+
+Each binding repo satisfies the same invariants:
+
+1. **In-repo generator**, one clearly-designated location. For JMEOS that is the
+   `codegen/` Maven module (`codegen/src/main/java/FunctionsGenerator.java`).
+2. **Own pin manifest** `tools/pin/compose-order.txt` — the canonical, dependency-ordered
+   fold list of the open PRs that compose this binding's 1.4 surface onto `main`.
+3. **Vendored catalog**, version-pinned, read-only: `codegen/input/meos-idl.json`.
+4. **Thin language projection** — language-neutral decisions (grouping, skip/classify,
+   portable names, shape) belong upstream in the catalog, so per-language generators do
+   not re-implement and drift.
+5. **Full automation (North Star):** generate-then-retire toward a **zero hand-written**
+   surface; anything that seems irreducible is either emitted by the generator or fixed at
+   source in MEOS (export the symbol) — never hand-patched in the binding.
+
+## JMEOS scope: raw FFI ONLY
+
+JMEOS owns the **raw FFI** projection: `FunctionsGenerator.java` →
+`jmeos-core/.../functions/GeneratedFunctions.java` plus the OO type layer.
+
+The `org.mobilitydb.meos.MeosOps*` **facades** and the **Spark-Connect registrar** are
+**consumer** projections — they are generated *in their consumer bindings*
+(MobilityFlink / MobilityKafka / MobilitySpark), not in JMEOS. Flink already carries the
+facade generator (`tools/codegen_facades.py`). Keeping them out of JMEOS is what prevents
+the FFI line and the facade line from diverging again.
+
+## Generate-then-retire — the green-CI version is the probe
+
+Removing hand-written code happens **little by little, never wipe-first**:
+
+1. build/align the generator to the canonical structure;
+2. generate the full surface, build green;
+3. **prove generated ⊇ hand** against the **last green-CI version** (the equivalence
+   probe) — suite + parity, **family by family**;
+4. retire the hand registrations for that family;
+5. repeat. The green-CI baseline is what catches a generated gap before it ships.
+
+## Pinning: this binding's catalog comes from a MobilityDB pin
+
+JMEOS's vendored `codegen/input/meos-idl.json` is generated from a MobilityDB
+`ecosystem-pin-*` (master ⊕ the MobilityDB compose-order). That pin is the *catalog/surface*
+input; JMEOS's own `tools/pin/compose-order.txt` governs *this repo's* PR accumulation. See
+`tools/pin/compose-order.txt` for the current composing set and the disposition of every
+open PR.
diff --git a/tools/pin/compose-order.txt b/tools/pin/compose-order.txt
new file mode 100644
index 00000000..9649cad1
--- /dev/null
+++ b/tools/pin/compose-order.txt
@@ -0,0 +1,64 @@
+# USER-APPROVED-PIN-WRITE — creating JMEOS's first pin manifest (user 2026-06-25:
+# "you also need a pin for JMEOS"). New file in the JMEOS repo, NOT a mutation of
+# MobilityDB's pin tooling.
+#
+# JMEOS pin — THE canonical, dependency-ordered fold manifest (per-binding policy).
+#
+# WHY JMEOS NEEDS A PIN: `main` is still JMEOS 1.3 (commit d4232a0, "JMEOS 1.3 (#9)").
+# The ENTIRE 1.4 generated surface lives in OPEN PRs — none merged. So the reviewable
+# 1.4 JMEOS = current `main` ⊕ a deterministic fold of the confirmed open composing PRs,
+# in THIS order. Do NOT re-derive the set/order ad hoc; edit THIS file (reviewed) instead.
+# (policy: generator-per-binding-canonical-policy, binding-compose-order-manifest)
+#
+# SCOPE (the per-binding policy): JMEOS owns the **raw-FFI generator ONLY**
+# (codegen/FunctionsGenerator.java -> jmeos-core .../functions/GeneratedFunctions.java +
+# the OO type layer). The org.mobilitydb.meos `MeosOps*` facades and the Spark-Connect
+# registrar are CONSUMER projections — they do NOT compose JMEOS; they live in the
+# consumer bindings (MobilityFlink/Kafka/Spark). See the RELOCATE section below.
+#
+# Format: # role
+# '?' prefix = membership/order UNCONFIRMED (subsumption vs #28 not yet proven — VERIFY
+# commit-level before trusting; do not fold a '?' line blindly). base = current origin/main.
+
+# ── WAVE 0 — FFI SURFACE TO 1.4 (-22a catalog) ── # USER-APPROVED-PIN-WRITE (JMEOS manifest, pre-review refinement)
+# #28 is the COMPLETE all-families -22a frontier (VERIFIED in GeneratedFunctions.java, 40658 lines:
+# trgeometry 330, rgeo 546, cbuffer 792, npoint 507, pose 543, th3 210, quadbin 219, jsonb 576, tbigint 195).
+28 feat/facade-surface-22a        # the 1.4 base: full -22a regen, all families + OO layer + tests.
+#
+# #21/#22/#23 each add a distinct GENERATOR FEATURE on the divergent facade-line. They touch the SAME
+# generator files as #28 (codegen/FunctionsGenerator.java + jmeos-core .../GeneratedFunctions.java + pom),
+# so they do NOT fold linearly onto #28 — RECONCILE by UNIONING the generator features, then REGENERATING
+# GeneratedFunctions from the -22a catalog (the generator is the SoT; the output is regenerated, not merged):
+21 fix/free-owned-char-main       # generator: free OWNED char* returns (isOwnedCharReturn).
+                                  # PROVEN absent from #28's FunctionsGenerator.java (884 lines, no isOwnedCharReturn).
+22 feat/family-build-flags        # generator+pom: select families via -DCBUFFER/NPOINT/POSE/RGEO/H3=OFF.
+                                  # PROVEN absent from #28's pom.xml (no family profiles/arguments).
+23 feat/bump-trgeometry-post-1137 # generator: post-#1137 trgeometry C-API handling. The trgeometry
+                                  # SURFACE is already in #28 (-22a); union the generator-side delta + regen.
+
+# ── WAVE 1 — stream-consumer hand utilities ──
+?18 feat/spatial-haversine        # Haversine/PointToSegment geodesic helpers (hand utils, jmeos-core utils/spatial/).
+                                  # Per policy a stream-consumer concern — candidate to RELOCATE to the stream
+                                  # binding rather than JMEOS core. Confirm with committers.
+
+# ── CLEANUP ──
+20 JashanReel:main                # remove the duplicate root src/main/java/functions/functions.java
+                                  # (-11429/+0; canonical generated file is jmeos-core .../GeneratedFunctions.java).
+                                  # Accept on merit, rebase onto the #28 frontier.
+
+# ════════════════════════════════════════════════════════════════════════════════════
+# RELOCATE TO CONSUMER BINDINGS (NOT JMEOS — facade/registrar projections, per policy).
+# These carry mixed FFI+facade content; only the FFI/catalog deltas (if any beyond #28)
+# belong in JMEOS — the facade/registrar parts move to the consumer that owns them.
+#   24 feat/bump-pin-588768d7      facade emit_*_facade.py + org.mobilitydb.meos.* + parity tooling
+#   25 feat/jmeos-setset-join      set-set facade helper
+#   26 feat/named-surface-codegen  Spark-Connect registrar (MobilitySparkConnectExtensionsGen.scala,
+#                                  generate_spark_registrar.py, named-surface) -> MobilitySpark
+#   27 reconcile/jmeos-pin-12l     SUPERSEDED: pure facade line on the OLDER pin-12l catalog;
+#                                  its facade generator already lives in MobilityFlink #31
+#                                  (tools/codegen_facades.py). No content lost.
+# Target homes: MobilityFlink #31 (codegen_facades.py present ✓), MobilityKafka, MobilitySpark.
+#
+# OBSOLETE (close):
+#   8 SachaDelsaux:JMEOS_v1.3      legacy 1.3 line, CONFLICTING — superseded by the 1.4 generated frontier.
+# ════════════════════════════════════════════════════════════════════════════════════
diff --git a/tools/regen-from-pin.sh b/tools/regen-from-pin.sh
new file mode 100755
index 00000000..9273c2e5
--- /dev/null
+++ b/tools/regen-from-pin.sh
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+# regen-from-pin.sh — regenerate JMEOS (the JVM FFI binding) from the MEOS catalog and build
+# the jar the JVM consumers (Spark/Flink/Kafka) bind (per GENERATION.md / codegen/GENERATION.md).
+#
+# Usage: tools/regen-from-pin.sh
+#   env: CATALOG = path to meos-idl.json produced by MEOS-API run.py (required)
+#        LIBMEOS = path to the all-families libmeos.so built from the same pin (for tests; optional)
+#
+# Invoked standalone, or by MEOS-API tools/ecosystem-generate.sh (phase 1, before the JVM consumers).
+set -euo pipefail
+PIN="${1:?usage: regen-from-pin.sh }"
+CATALOG="${CATALOG:?set CATALOG to the meos-idl.json from MEOS-API run.py}"
+HERE="$(cd "$(dirname "$0")/.." && pwd)"
+
+# 1. vendor the catalog (codegen/input/meos-idl.json is the generator's committed input)
+cp "$CATALOG" "$HERE/codegen/input/meos-idl.json"
+
+# 2. run FunctionsGenerator with EXPLICIT in/out paths (the default base doubles to codegen/codegen/)
+( cd "$HERE" && mvn -q -pl codegen -am compile \
+  && mvn -q -pl codegen exec:java -Dexec.mainClass=FunctionsGenerator \
+       -Dexec.args="codegen/input/meos-idl.json jmeos-core/src/main/java/functions/GeneratedFunctions.java" )
+
+# 3. build the jar the JVM consumers bind
+if [ -n "${LIBMEOS:-}" ]; then cp "$LIBMEOS" "$HERE/jmeos-core/src/libmeos.so" 2>/dev/null || true; fi
+( cd "$HERE" && mvn -q -pl jmeos-core -am -DskipTests package )
+echo "[jmeos] regenerated + jar built at pin $PIN -> $HERE/jar/JMEOS.jar"
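
The manifest says a `?` line must not be folded blindly and that the order in the file is the fold order. A minimal sketch of how a fold script could read `tools/pin/compose-order.txt` under those rules follows. Nothing in this PR ships such a reader: the function name `list_fold_prs` is hypothetical, and the sketch assumes each entry starts with a PR number followed by whitespace, that a leading `?` marks an unconfirmed entry, and that any line whose first non-blank character is `#` is a comment.

```shell
#!/usr/bin/env bash
# Sketch only: print the confirmed PR numbers from a compose-order manifest, in file order.
# Assumptions (not confirmed by the PR): entries look like "28 feat/branch  # role",
# a leading '?' marks an unconfirmed entry, and '#' lines (even indented) are comments.
set -euo pipefail

# list_fold_prs is a hypothetical helper name used for illustration.
list_fold_prs() {
  local manifest="$1" line pr
  while IFS= read -r line || [ -n "$line" ]; do
    # Trim leading whitespace so indented continuation comments count as comments.
    line="${line#"${line%%[![:space:]]*}"}"
    case "$line" in
      ''|'#'*) continue ;;   # blank line or comment
      '?'*)    echo "skip (unconfirmed): ${line%%[[:space:]]*}" >&2; continue ;;
    esac
    pr="${line%%[[:space:]]*}"   # first whitespace-delimited token
    case "$pr" in
      *[!0-9]*) echo "unrecognised entry: $line" >&2; return 1 ;;
      *)        echo "$pr" ;;
    esac
  done < "$manifest"
}
```

Calling `list_fold_prs tools/pin/compose-order.txt` prints one PR number per line and reports skipped `?` entries on stderr. Failing on an unrecognised entry, rather than ignoring it, keeps a malformed manifest from silently changing the fold set.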