Generate lineage schema from source code #7159
Closed
jorgee wants to merge 2 commits into
Fix a typo in Checksum's GroovyDoc (algortihm -> algorithm), correct FileOutput's description (it models a file, not a base class), and tidy DataPath, Parameter and WorkflowRun class-level comments. Signed-off-by: jorgee <jorge.ejarque@seqera.io>
Introduce a buildSrc Gradle task (generateLineageSchema) that produces a
JSON Schema draft 2020-12 document describing the wire format emitted by
LinEncoder. The task loads compiled model classes via a URLClassLoader,
runs victools jsonschema-generator against the LinTypeAdapterFactory
subtypes, and wraps each in the {version, kind, spec} envelope used at
runtime. Shared types are deduplicated via $defs/$ref; class titles
and descriptions are derived from the model GroovyDoc.
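
As a rough illustration of the envelope and $defs/$ref layout described above, the Python sketch below wraps per-kind spec schemas in a {version, kind, spec} envelope and stores each shared type once. It is not the Gradle task's code: the function name, the version string, the choice of kinds and every property shown are assumptions made for the example, and the real generated schema may be organised differently.

```python
import json

# Assumed version string, for illustration only; the real value is whatever LinEncoder emits.
VERSION = "lineage/v1beta1"


def build_envelope_schema(kind_specs, shared_defs, version=VERSION):
    """Build one draft 2020-12 schema accepting any {version, kind, spec} envelope.

    kind_specs maps a kind name to the schema of its spec.
    shared_defs maps the name of a shared type to its schema.
    Every schema is stored once under $defs and referenced through $ref.
    """
    envelopes = []
    for kind in sorted(kind_specs):
        envelopes.append({
            "type": "object",
            "properties": {
                "version": {"const": version},
                "kind": {"const": kind},
                "spec": {"$ref": f"#/$defs/{kind}"},
            },
            "required": ["version", "kind", "spec"],
        })
    defs = dict(shared_defs)
    defs.update(kind_specs)
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "oneOf": envelopes,
        "$defs": defs,
    }


# Hypothetical, heavily simplified schemas; the property names are invented.
checksum_ref = {"$ref": "#/$defs/Checksum"}
example = build_envelope_schema(
    kind_specs={
        "FileOutput": {"type": "object", "properties": {"checksum": checksum_ref}},
        "WorkflowRun": {"type": "object", "properties": {"name": {"type": "string"}}},
    },
    shared_defs={
        "Checksum": {"type": "object", "properties": {"value": {"type": "string"}}},
    },
)
print(json.dumps(example, indent=2))
```

Because Checksum appears only once under $defs, any kind that uses it points at the same definition instead of repeating it inline.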
The generated schema is checked in at
modules/nf-lineage/src/resources/schema/lineage-v1beta1.schema.json so
diffs surface in PRs; regeneration is manual via
'./gradlew :nf-lineage:generateLineageSchema'. A validator script under
specs/260519-lineage-json-schema/validate.py supports checking real
.data.json files against the schema.
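
The contents of validate.py are not shown here. As a minimal sketch of the kind of check such a script might start with, the Python below inspects only the {version, kind, spec} envelope of a parsed record, using the standard library. The function name, the set of kinds, the version string and the sample record are all assumptions; validating a record fully against the draft 2020-12 schema would need a JSON Schema validator library.

```python
import json


def check_envelope(record, known_kinds, expected_version):
    """Return a list of problems found in the record's envelope (empty if none)."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    problems = []
    # All three envelope keys must be present.
    for key in ("version", "kind", "spec"):
        if key not in record:
            problems.append(f"missing required key: {key}")
    if "version" in record and record["version"] != expected_version:
        problems.append(f"unexpected version: {record['version']!r}")
    if "kind" in record and record["kind"] not in known_kinds:
        problems.append(f"unknown kind: {record['kind']!r}")
    if "spec" in record and not isinstance(record["spec"], dict):
        problems.append("spec is not a JSON object")
    return problems


# Hypothetical inputs; the kinds and the version string are made up for the example.
kinds = {"FileOutput", "WorkflowRun"}
sample = json.loads('{"version": "lineage/v1beta1", "kind": "FileOutput", "spec": {}}')
ok = check_envelope(sample, kinds, "lineage/v1beta1")
bad = check_envelope({"kind": "Other"}, kinds, "lineage/v1beta1")
print(ok)
print(bad)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a file in a single pass.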
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
Closing for now in favor of maintaining the lineage schema manually. Since the lineage schema may be extended in the future to include other record types not used by Nextflow, not sure it's worth maintaining a 1-to-1 automation. Let's just be diligent about keeping the schema up to date when we add things to the runtime.
This pull request introduces a new Gradle task to automatically generate a JSON Schema for the lineage model (v1beta1) in the nf-lineage module, ensuring the schema stays in sync with the model classes and their documentation. It adds the schema generation logic, updates build files to register the task, and improves model class documentation for clarity and schema accuracy.

Schema generation and build integration:
- Added a new Gradle task, generateLineageSchema (implemented in GenerateLineageSchemaTask.groovy), that uses the victools JSON Schema generator to produce a draft 2020-12 JSON Schema for the lineage model, outputting it to src/resources/schema/lineage-v1beta1.schema.json. The task ensures all model subtypes are included and their documentation is reflected in the schema.
- Updated the build by changing buildSrc/build.gradle and registering the task in modules/nf-lineage/build.gradle.

Model documentation improvements:
- Updated the model classes (Checksum.groovy, DataPath.groovy, FileOutput.groovy, Parameter.groovy, WorkflowRun.groovy) to provide clearer and more accurate descriptions, which are now included in the generated schema.

Schema output and developer workflow:
- Added the generated schema (lineage-v1beta1.schema.json) to the repository, which describes the structure and documentation for all lineage model types.
- Updated LinTypeAdapterFactory.groovy to keep the schema and registered subtypes in sync, and provided comments in the build script to guide schema regeneration after model changes.

This automation reduces manual effort and the risk of schema drift as the model evolves.