Generate lineage schema from source code by jorgee · Pull Request #7159 · nextflow-io/nextflow

jorgee · 2026-05-20T07:45:26Z

This pull request introduces a new Gradle task to automatically generate a JSON Schema for the lineage model (v1beta1) in the nf-lineage module, ensuring the schema stays in sync with the model classes and their documentation. It adds the schema generation logic, updates build files to register the task, and improves model class documentation for clarity and schema accuracy.

Schema generation and build integration:

Added a new Gradle task generateLineageSchema (implemented in GenerateLineageSchemaTask.groovy) that uses the victools JSON Schema generator to produce a draft 2020-12 JSON Schema for the lineage model, outputting it to src/resources/schema/lineage-v1beta1.schema.json. The task ensures all model subtypes are included and their documentation is reflected in the schema.
Integrated the new schema generation task into the build process by updating buildSrc/build.gradle and registering the task in modules/nf-lineage/build.gradle.

Model documentation improvements:

Updated class-level documentation in model files (Checksum.groovy, DataPath.groovy, FileOutput.groovy, Parameter.groovy, WorkflowRun.groovy) to provide clearer and more accurate descriptions, which are now included in the generated schema.

Schema output and developer workflow:

Added a generated JSON Schema file (lineage-v1beta1.schema.json) to the repository, which describes the structure and documentation for all lineage model types.
Added developer guidance in LinTypeAdapterFactory.groovy to keep the schema and registered subtypes in sync, and provided comments in the build script to guide schema regeneration after model changes.
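One way to picture the "keep the schema and registered subtypes in sync" guidance is a small drift check. The sketch below is not part of this PR: it assumes the schema exposes one `oneOf` branch per record type with a `kind` constant, and the `registered` list is a hypothetical stand-in for the subtypes registered in LinTypeAdapterFactory.

```python
def schema_kinds(schema):
    """Collect the `kind` constants declared by each top-level oneOf branch."""
    kinds = set()
    for branch in schema.get("oneOf", []):
        const = branch.get("properties", {}).get("kind", {}).get("const")
        if const is not None:
            kinds.add(const)
    return kinds


def drift(schema, registered):
    """Compare the kinds declared in the schema with the registered subtype names."""
    declared = schema_kinds(schema)
    registered = set(registered)
    return {
        "missing_from_schema": sorted(registered - declared),
        "unregistered_in_schema": sorted(declared - registered),
    }


# Hypothetical inputs: a trimmed schema (assumed layout) and an assumed subtype list.
schema = {
    "oneOf": [
        {"properties": {"kind": {"const": "WorkflowRun"}}},
        {"properties": {"kind": {"const": "FileOutput"}}},
    ]
}
registered = ["WorkflowRun", "FileOutput", "TaskRun"]

report = drift(schema, registered)
```

With these inputs, `report["missing_from_schema"]` lists the subtype that the schema does not yet describe, which is the kind of gap the guidance is meant to prevent.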

This automation reduces manual effort and the risk of schema drift as the model evolves.

Fix a typo in Checksum's GroovyDoc (algortihm -> algorithm), correct FileOutput's description (it models a file, not a base class), and tidy DataPath, Parameter and WorkflowRun class-level comments.

Signed-off-by: jorgee <jorge.ejarque@seqera.io>

Introduce a buildSrc Gradle task (generateLineageSchema) that produces a JSON Schema draft 2020-12 document describing the wire format emitted by LinEncoder. The task loads compiled model classes via a URLClassLoader, runs victools jsonschema-generator against the LinTypeAdapterFactory subtypes, and wraps each in the {version, kind, spec} envelope used at runtime. Shared types are deduplicated via $defs/$ref; class titles and descriptions are derived from the model GroovyDoc.

The generated schema is checked in at modules/nf-lineage/src/resources/schema/lineage-v1beta1.schema.json so diffs surface in PRs; regeneration is manual via './gradlew :nf-lineage:generateLineageSchema'. A validator script under specs/260519-lineage-json-schema/validate.py supports checking real .data.json files against the schema.

Signed-off-by: jorgee <jorge.ejarque@seqera.io>
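To make the {version, kind, spec} envelope and the $defs/$ref deduplication more concrete, here is a minimal Python sketch of the schema shape the commit message describes. It is not the generator from this PR (which is a Groovy task built on victools) and not the contents of validate.py. The version string, the field lists, and the helper names `envelope`, `build_schema` and `check_envelope` are assumptions made for illustration only.

```python
DIALECT = "https://json-schema.org/draft/2020-12/schema"
VERSION = "lineage/v1beta1"  # assumed version string, not taken from the PR


def envelope(kind):
    """Wrap one model type in the {version, kind, spec} envelope."""
    return {
        "type": "object",
        "properties": {
            "version": {"const": VERSION},
            "kind": {"const": kind},
            # The spec body lives once under $defs and is referenced by name.
            "spec": {"$ref": f"#/$defs/{kind}"},
        },
        "required": ["version", "kind", "spec"],
    }


def build_schema(defs, kinds):
    """Assemble a document with one envelope branch per record kind."""
    return {"$schema": DIALECT, "oneOf": [envelope(k) for k in kinds], "$defs": defs}


def check_envelope(record, schema):
    """Report envelope problems only; this is not a full JSON Schema validator."""
    problems = [f"missing '{k}'" for k in ("version", "kind", "spec") if k not in record]
    known = {b["properties"]["kind"]["const"] for b in schema["oneOf"]}
    if "kind" in record and record["kind"] not in known:
        problems.append(f"unknown kind '{record['kind']}'")
    return problems


# Illustrative, heavily simplified definitions (field lists are assumptions).
defs = {
    "Checksum": {
        "type": "object",
        "properties": {"value": {"type": "string"}, "algorithm": {"type": "string"}},
    },
    "FileOutput": {
        "type": "object",
        # Shared type referenced via $ref instead of being inlined.
        "properties": {"path": {"type": "string"}, "checksum": {"$ref": "#/$defs/Checksum"}},
    },
    "WorkflowRun": {"type": "object", "properties": {"name": {"type": "string"}}},
}

schema = build_schema(defs, ["FileOutput", "WorkflowRun"])
record = {"version": VERSION, "kind": "FileOutput", "spec": {"path": "/out/a.txt"}}
problems = check_envelope(record, schema)
```

Checking a real .data.json file against the full schema would need a draft 2020-12 validator; `check_envelope` only illustrates the envelope layer.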

netlify · 2026-05-20T07:45:31Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`38a2cee`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a0d66987e4cf0000824d8ed

bentsherman · 2026-06-08T14:52:53Z

Closing for now in favor of maintaining the lineage schema manually.

Since the lineage schema may be extended in the future to include other record types not used by Nextflow, I'm not sure it's worth maintaining a 1-to-1 automation. Let's just be diligent about keeping the schema up-to-date when we add things to the runtime.

jorgee added 2 commits May 19, 2026 18:05

pinin4fjords mentioned this pull request Jun 8, 2026

Lineage TaskRun record omits cache-determining inputs (module, eval, stub), so 'lineage diff' can't explain those cache misses #7202

Open

bentsherman closed this Jun 8, 2026

jorgee wants to merge 2 commits into master from generate-lineage-schema-from-source-code
