feat(catalog): add loader for security-evaluations.ndjson by pboyd · Pull Request #2779 · kubeflow/hub

pboyd · 2026-06-03T17:57:42Z

Description

Load security-evaluations.ndjson into a metrics artifact with metricsType security-metrics when the performance metrics are loaded.

How Has This Been Tested?

I added some examples to the demo performance data. They can be queried if you're running the demo overlay for Tilt:

$ curl -s "http://localhost:8082/api/model_catalog/v1alpha1/sources/validated_ai_models/models/certified/analytics-forecaster-5b/artifacts?filterQuery=metricsType.string_value=%22security-metrics%22" | jq '.items[]'
{
  "artifactType": "metrics-artifact",
  "createTimeSinceEpoch": "1780576659176",
  "customProperties": {
    "benchmark": {
      "metadataType": "MetadataStringValue",
      "string_value": "intents"
    },
    "category": {
      "metadataType": "MetadataStringValue",
      "string_value": "security"
    },
    "description": {
      "metadataType": "MetadataStringValue",
      "string_value": "Risk assessment with a context-aware custom intent typology and probes of increasing complexity. Runs as an AI Pipeline and requires a Data Science Pipelines setup."
    },
    "evaluation": {
      "metadataType": "MetadataStringValue",
      "string_value": "Context-aware vulnerability scan (Pipeline)"
    },
    "id": {
      "metadataType": "MetadataStringValue",
      "string_value": "b1c2d3e4-3001-4000-a000-000000000005"
    },
    "lower_is_better": {
      "bool_value": true,
      "metadataType": "MetadataBoolValue"
    },
    "model_id": {
      "metadataType": "MetadataStringValue",
      "string_value": "certified/analytics-forecaster-5b"
    },
    "pass": {
      "bool_value": false,
      "metadataType": "MetadataBoolValue"
    },
    "provider_id": {
      "metadataType": "MetadataStringValue",
      "string_value": "garak-kfp"
    },
    "result": {
      "double_value": 0.65,
      "metadataType": "MetadataDoubleValue"
    },
    "result_metric": {
      "metadataType": "MetadataStringValue",
      "string_value": "attack_success_rate"
    },
    "threshold": {
      "double_value": 0.3,
      "metadataType": "MetadataDoubleValue"
    }
  },
  "externalId": "b1c2d3e4-3001-4000-a000-000000000005",
  "id": "259215",
  "lastUpdateTimeSinceEpoch": "1780576659176",
  "metricsType": "security-metrics",
  "name": "security-b1c2d3e4-3001-4000-a000-000000000005"
}
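The pass value in the record above is consistent with comparing result against threshold in the direction given by lower_is_better: 0.65 is above the 0.3 threshold for a lower-is-better metric, so the evaluation fails. The sketch below only illustrates that reading. The evaluation type and the passes helper are hypothetical names, the inclusive comparison at the threshold is an assumption, and the PR does not say whether the loader computes pass or reads it straight from the file.

```go
package main

import "fmt"

// evaluation holds the three custom properties relevant to the comparison.
// The type is hypothetical; it is not taken from the PR's code.
type evaluation struct {
	Result        float64
	Threshold     float64
	LowerIsBetter bool
}

// passes reports whether the result is on the good side of the threshold.
// Treating a result equal to the threshold as passing is an assumption.
func passes(e evaluation) bool {
	if e.LowerIsBetter {
		return e.Result <= e.Threshold
	}
	return e.Result >= e.Threshold
}

func main() {
	// Values from the analytics-forecaster-5b record shown above.
	e := evaluation{Result: 0.65, Threshold: 0.3, LowerIsBetter: true}
	fmt.Println(passes(e)) // prints "false", matching "pass": false
}
```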

Merge criteria:

All the commits have been signed-off (To pass the DCO check)

The commits have meaningful messages
Automated tests are provided as part of the PR for major new functionalities; testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work.
Code changes follow the kubeflow contribution guidelines.

google-oss-prow · 2026-06-03T17:57:50Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from pboyd. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Add sample security evaluation data for three of the four certified demo models, to be consumed by the upcoming security-metrics loader.

- certified/production-llm-8b: 2 records (pipeline + standalone scans, both passing)
- certified/compliance-assistant-3b: 2 records (pipeline + domain-specific scans, both passing)
- certified/analytics-forecaster-5b: 1 record (pipeline scan, failing; attack_success_rate: 0.65)
- certified/secure-embeddings-v2: no file (embeddings models are not garak targets)

The intentional absence for secure-embeddings-v2 exercises the loader's "skip silently when security-evaluations.ndjson is absent" behavior.

Assisted-by: Claude Opus 4.6
Signed-off-by: Paul Boyd <paul@pboyd.io>

Extends processModelArtifactsBatch() to parse security-evaluations.ndjson files from the existing performance metrics directories, creating artifacts with metricsType: security-metrics. No new config flags or dependencies required.

Assisted-by: Claude Sonnet 4.6
Signed-off-by: Paul Boyd <paul@pboyd.io>
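As a rough illustration of the behavior the two commit messages describe (parse one JSON record per line, skip silently when the file is absent), here is a minimal sketch. It is not the code in this PR. The securityEvaluation struct, the function names, and the JSON field names are assumptions inferred from the custom properties in the example response. Only the file name and the "security-" name prefix come from the source.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// securityEvaluation is a hypothetical record shape; the field names are
// guessed from the artifact's custom properties, not from the real file.
type securityEvaluation struct {
	ID            string  `json:"id"`
	ModelID       string  `json:"model_id"`
	ProviderID    string  `json:"provider_id"`
	ResultMetric  string  `json:"result_metric"`
	Result        float64 `json:"result"`
	Threshold     float64 `json:"threshold"`
	LowerIsBetter bool    `json:"lower_is_better"`
	Pass          bool    `json:"pass"`
}

// parseSecurityEvaluations decodes NDJSON: one JSON object per line,
// ignoring blank lines and reporting the line number of a bad record.
func parseSecurityEvaluations(data []byte) ([]securityEvaluation, error) {
	var out []securityEvaluation
	sc := bufio.NewScanner(bytes.NewReader(data))
	for line := 1; sc.Scan(); line++ {
		raw := bytes.TrimSpace(sc.Bytes())
		if len(raw) == 0 {
			continue
		}
		var ev securityEvaluation
		if err := json.Unmarshal(raw, &ev); err != nil {
			return nil, fmt.Errorf("line %d: %w", line, err)
		}
		out = append(out, ev)
	}
	return out, sc.Err()
}

// loadSecurityEvaluations reads the file from a model's metrics directory.
// A missing file is not an error: it returns no records and a nil error.
func loadSecurityEvaluations(dir string) ([]securityEvaluation, error) {
	data, err := os.ReadFile(filepath.Join(dir, "security-evaluations.ndjson"))
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	return parseSecurityEvaluations(data)
}

// artifactName follows the "security-<id>" pattern seen in the response.
func artifactName(ev securityEvaluation) string {
	return "security-" + ev.ID
}

func main() {
	sample := []byte(`{"id":"b1c2d3e4-3001-4000-a000-000000000005","result":0.65,"threshold":0.3}` + "\n")
	evs, err := parseSecurityEvaluations(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, ev := range evs {
		fmt.Println(artifactName(ev), ev.Result)
	}
}
```

Returning a nil slice and a nil error for a missing file lets a caller treat "no security data" the same as "zero records", which is one way to get the silent skip described for certified/secure-embeddings-v2.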

google-oss-prow Bot requested review from Tomcli and fege June 3, 2026 17:57

google-oss-prow Bot added the size/XL label Jun 3, 2026

github-actions Bot added Area/Manifests Area/Model Catalog labels Jun 3, 2026

pboyd force-pushed the security-metrics-loader branch from 4492752 to 7fef426 Compare June 4, 2026 12:38

pboyd force-pushed the security-metrics-loader branch from 7fef426 to 7a1130c Compare June 11, 2026 13:59

pboyd force-pushed the security-metrics-loader branch from 7a1130c to 470e8d7 Compare June 11, 2026 14:16

pboyd wants to merge 2 commits into kubeflow:main from pboyd:security-metrics-loader
