
Generate lineage schema from source code#7159

Closed
jorgee wants to merge 2 commits into master from generate-lineage-schema-from-source-code

@jorgee commented May 20, 2026

Contributor

This pull request introduces a new Gradle task to automatically generate a JSON Schema for the lineage model (v1beta1) in the nf-lineage module, ensuring the schema stays in sync with the model classes and their documentation. It adds the schema generation logic, updates build files to register the task, and improves model class documentation for clarity and schema accuracy.

Schema generation and build integration:

  • Added a new Gradle task, generateLineageSchema (implemented in GenerateLineageSchemaTask.groovy), that uses the victools JSON Schema generator to produce a draft 2020-12 JSON Schema for the lineage model and writes it to src/resources/schema/lineage-v1beta1.schema.json. The task includes every model subtype and carries its documentation into the schema.
  • Integrated the task into the build by updating buildSrc/build.gradle and registering it in modules/nf-lineage/build.gradle.

Model documentation improvements:

  • Updated the class-level documentation in the model files (Checksum.groovy, DataPath.groovy, FileOutput.groovy, Parameter.groovy, WorkflowRun.groovy) with clearer, more accurate descriptions, which now appear in the generated schema.

Schema output and developer workflow:

  • Added the generated JSON Schema file (lineage-v1beta1.schema.json) to the repository; it describes the structure and documentation of all lineage model types.
  • Added developer guidance in LinTypeAdapterFactory.groovy on keeping the schema and the registered subtypes in sync, plus comments in the build script explaining how to regenerate the schema after model changes.

This automation reduces manual effort and the risk of schema drift as the model evolves.
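
To make "schema drift" concrete, here is a minimal sketch of the kind of check that would catch it, whether the schema is generated or maintained by hand. It assumes (this is not confirmed by the PR) that the schema exposes one top-level `oneOf` branch per record kind, each pinning `kind` with a `const`. The function names and the example kinds are illustrative only.

```python
def schema_kinds(schema):
    """Collect the `kind` constants declared by the schema's top-level branches.

    Assumes a hypothetical layout: {"oneOf": [{"properties": {"kind": {"const": ...}}}]}.
    """
    kinds = set()
    for branch in schema.get("oneOf", []):
        const = branch.get("properties", {}).get("kind", {}).get("const")
        if const is not None:
            kinds.add(const)
    return kinds


def find_drift(registered, schema):
    """Compare the subtype names registered at runtime with the schema's kinds."""
    declared = schema_kinds(schema)
    return {
        # Registered in the runtime but absent from the schema.
        "missing_from_schema": sorted(set(registered) - declared),
        # Declared in the schema but not registered in the runtime.
        "unregistered_in_runtime": sorted(declared - set(registered)),
    }


# Illustrative input: the schema only knows FileOutput, the runtime also has WorkflowRun.
example_schema = {"oneOf": [{"properties": {"kind": {"const": "FileOutput"}}}]}
drift = find_drift(["FileOutput", "WorkflowRun"], example_schema)
```

A non-empty `missing_from_schema` list is the case the PR is trying to prevent; a non-empty `unregistered_in_runtime` list is not necessarily an error if the schema is allowed to describe record types the runtime does not emit.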

jorgee added 2 commits May 19, 2026 18:05
Fix a typo in Checksum's GroovyDoc (algortihm -> algorithm), correct
FileOutput's description (it models a file, not a base class), and tidy
DataPath, Parameter and WorkflowRun class-level comments.

Signed-off-by: jorgee <jorge.ejarque@seqera.io>

Introduce a buildSrc Gradle task (generateLineageSchema) that produces a
JSON Schema draft 2020-12 document describing the wire format emitted by
LinEncoder. The task loads compiled model classes via a URLClassLoader,
runs victools jsonschema-generator against the LinTypeAdapterFactory
subtypes, and wraps each in the {version, kind, spec} envelope used at
runtime. Shared types are deduplicated via $defs/$ref; class titles
and descriptions are derived from the model GroovyDoc.
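
As a rough illustration of the shape described above (not the actual task output), the sketch below wraps each top-level kind in a {version, kind, spec} envelope and shares nested types through $defs/$ref. The field names (`path`, `checksum`, `value`, `algorithm`) and the version string are assumptions made for the example; the real schema is whatever the generator derives from the model classes.

```python
import json

# Dialect identifier for JSON Schema draft 2020-12.
SCHEMA_DIALECT = "https://json-schema.org/draft/2020-12/schema"


def envelope(kind, version):
    """Wrap one model type in the {version, kind, spec} envelope."""
    return {
        "type": "object",
        "properties": {
            "version": {"const": version},
            "kind": {"const": kind},
            # The payload is referenced, not inlined, so shared types appear once.
            "spec": {"$ref": f"#/$defs/{kind}"},
        },
        "required": ["version", "kind", "spec"],
    }


def build_schema(type_defs, top_level_kinds, version):
    """Assemble a schema with one envelope branch per top-level kind."""
    return {
        "$schema": SCHEMA_DIALECT,
        "$defs": type_defs,
        "oneOf": [envelope(kind, version) for kind in top_level_kinds],
    }


# Illustrative definitions: Checksum is defined once and referenced from FileOutput.
type_defs = {
    "Checksum": {
        "type": "object",
        "properties": {"value": {"type": "string"}, "algorithm": {"type": "string"}},
    },
    "FileOutput": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "checksum": {"$ref": "#/$defs/Checksum"},
        },
    },
}

# The version string is a placeholder assumption, not taken from the PR.
schema = build_schema(type_defs, ["FileOutput"], version="lineage/v1beta1")
schema_text = json.dumps(schema, indent=2)
```

Only the top-level kinds get an envelope branch; helper types such as Checksum live in `$defs` and are reachable only by reference.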

The generated schema is checked in at
modules/nf-lineage/src/resources/schema/lineage-v1beta1.schema.json so
diffs surface in PRs; regeneration is manual via
'./gradlew :nf-lineage:generateLineageSchema'. A validator script under
specs/260519-lineage-json-schema/validate.py supports checking real
.data.json files against the schema.
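
The contents of validate.py are not shown in this thread. As a stand-in, the stdlib-only sketch below checks just the envelope of a .data.json document; a full validator would presumably hand the document and the schema to a JSON Schema library instead. The function name, the error strings, and the sample document are hypothetical.

```python
import json


def check_envelope(document, known_kinds):
    """Return a list of problems with a record's {version, kind, spec} envelope.

    This only inspects the envelope; it does not validate `spec` against the schema.
    """
    if not isinstance(document, dict):
        return ["document is not a JSON object"]
    errors = []
    for key in ("version", "kind", "spec"):
        if key not in document:
            errors.append(f"missing required key: {key}")
    kind = document.get("kind")
    if kind is not None and kind not in known_kinds:
        errors.append(f"unknown kind: {kind}")
    if "spec" in document and not isinstance(document["spec"], dict):
        errors.append("spec is not a JSON object")
    return errors


# Illustrative record; the version value and spec fields are placeholders.
sample_text = '{"version": "lineage/v1beta1", "kind": "FileOutput", "spec": {"path": "out.txt"}}'
sample_errors = check_envelope(json.loads(sample_text), {"FileOutput", "WorkflowRun"})
```

An empty list means the envelope is well formed; each string in a non-empty list names one problem.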

Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@bentsherman

Member

Closing for now in favor of maintaining the lineage schema manually.

Since the lineage schema may be extended in the future to include other record types not used by Nextflow, I'm not sure it's worth maintaining a 1-to-1 automation. Let's just be diligent about keeping the schema up to date when we add things to the runtime.

@bentsherman closed this Jun 8, 2026