ADR: benchmark process directive for per-task performance metrics by edmundmiller · Pull Request #7146 · nextflow-io/nextflow

edmundmiller · 2026-05-14T20:23:59Z

Summary

This ADR proposes a new benchmark process directive that collects per-task performance metrics and can repeat a task's execution to reduce measurement noise. The directive reuses Nextflow's existing TraceRecord infrastructure and can be set from nextflow.config like other process directives.

Overview

The benchmark directive addresses a gap in Nextflow's metrics capabilities by providing:

Per-process metrics capture: A declarative way to emit task runtime metrics (wall time, peak memory, CPU usage, I/O) to a user-specified file
Configuration-driven benchmarking: Settable via nextflow.config using withName and withLabel selectors, enabling benchmarking of existing pipelines without source modification
Repeated execution support: Optional repeats: N parameter that spawns N independent task executions and aggregates their metrics
Multiple output formats: TSV (default) and JSONL output driven by file extension
Deterministic schema: Column names match TraceRecord.FIELDS for consistency with existing trace artifacts
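To make the extension-driven format choice concrete, here is a minimal language-neutral sketch in Python. It is not Nextflow code: the function name `serialize_records` is hypothetical, and the field subset (`name`, `realtime`, `peak_rss`) is only a small sample of trace field names chosen for illustration. It assumes any extension other than `.jsonl` falls back to TSV, matching the "TSV (default)" wording above.

```python
import csv
import io
import json
from pathlib import PurePosixPath


def serialize_records(path, records, fields):
    """Render records as JSONL when `path` ends in .jsonl, otherwise as TSV.

    `fields` fixes the column order, so the output schema is deterministic
    regardless of how the individual record dicts are ordered.
    """
    if PurePosixPath(path).suffix.lower() == ".jsonl":
        # One JSON object per line, keys emitted in schema order.
        return "".join(
            json.dumps({f: r.get(f) for f in fields}) + "\n" for r in records
        )

    # Default: tab-separated values with a single header row.
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for r in records:
        writer.writerow([r.get(f, "") for f in fields])
    return buf.getvalue()
```

Passing the field list explicitly, rather than deriving it from each record, is what keeps the columns stable across tasks and repeats in this sketch.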

Key Design Decisions

Reuses existing infrastructure: Leverages TraceRecord collection through the existing bash wrapper—no new probes or wrapper changes required
Independent repeats: Each repeat is a full, isolated task execution with its own workdir and scheduling, not a loop inside the wrapper script
First-repeat outputs: For repeated tasks, only the first successful repeat's outputs are emitted downstream; other repeats' workdirs are retained for inspection
Cloud-native: Works with AWS Batch, Google Batch, Kubernetes, and remote launch directories (S3, GCS, Azure Blob) through existing path-handling code
Consistent naming: Uses Nextflow's native TraceRecord field names rather than aliasing to Snakemake conventions
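The "first-repeat outputs" rule can be sketched as a small selection function. This is an illustration in Python, not the proposed implementation: `RepeatResult` and `first_successful` are hypothetical names, and the sketch assumes "first" means the lowest repeat index among successful runs (the ADR summary does not say whether ordering is by index or by completion time). The `COMPLETED` and `FAILED` values follow the status strings used in Nextflow trace reports.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RepeatResult:
    index: int     # 1-based position among the N repeats (hypothetical field)
    status: str    # e.g. "COMPLETED" or "FAILED"
    workdir: str   # each repeat runs in its own work directory


def first_successful(repeats: Sequence[RepeatResult]) -> Optional[RepeatResult]:
    """Return the successful repeat with the lowest index, or None if all failed.

    Only this repeat's outputs would be emitted downstream; the other
    work directories are left in place for inspection.
    """
    succeeded = [r for r in repeats if r.status == "COMPLETED"]
    return min(succeeded, key=lambda r: r.index, default=None)
```

Because repeats are scheduled independently, they may finish in any order, so the sketch selects by index rather than by position in the input sequence.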

Directive Syntax

// Shorthand string form
process align {
    benchmark "benchmarks/align/${sample}.tsv"
    // ...
}

// Map form with options
process align {
    benchmark file: "benchmarks/align/${sample}.jsonl", repeats: 3
    // ...
}

// Config-level usage
process {
    withName: 'ALIGN' {
        benchmark = [file: "benchmarks/align/${task.tag}.tsv", repeats: 5]
    }
}

Implementation Scope

The implementation is self-contained and minimal:

Add benchmark to the standard process directive list
Parse string shorthand and map forms in ProcessConfigBuilder
Create new Benchmark class mirroring PublishDir structure
Serialize aggregated TraceRecords to TSV/JSONL after all repeats complete
No changes to trace, report, timeline, or wrapper subsystems
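The parsing step (string shorthand versus map form) amounts to normalizing two input shapes into one value object. The following Python sketch shows the idea under stated assumptions: `BenchmarkSpec` and `normalize_benchmark` are hypothetical names, `repeats` is assumed to default to 1 when omitted, and unknown keys are assumed to be rejected. The actual parsing would live in ProcessConfigBuilder and may differ.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSpec:
    file: str
    repeats: int = 1  # assumed default: a single execution, no repeats


def normalize_benchmark(value) -> BenchmarkSpec:
    """Accept either the string shorthand or the map form of the directive."""
    if isinstance(value, str):
        # Shorthand: benchmark "benchmarks/align/sample.tsv"
        return BenchmarkSpec(file=value)

    if isinstance(value, Mapping):
        # Map form: benchmark file: "...", repeats: 3
        unknown = set(value) - {"file", "repeats"}
        if unknown:
            raise ValueError(f"unknown benchmark options: {sorted(unknown)}")
        if "file" not in value:
            raise ValueError("benchmark map form requires a 'file' entry")
        repeats = int(value.get("repeats", 1))
        if repeats < 1:
            raise ValueError("repeats must be a positive integer")
        return BenchmarkSpec(file=str(value["file"]), repeats=repeats)

    raise TypeError(f"unsupported benchmark value: {type(value).__name__}")
```

Normalizing early means the rest of the pipeline only ever sees one shape, which is the same reason the map and string forms of other directives are typically collapsed at parse time.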

Non-Goals

Aggregating benchmarks across tasks (existing trace.txt and report.html cover this)
Collecting metrics beyond TraceRecord (e.g., max_uss, max_pss are documented gaps)
Reliable benchmarking inside exec: blocks
Changes to existing trace/report/timeline infrastructure

https://claude.ai/code/session_01RW1UD9eQWkDLfR97WdegXk

Proposes a benchmark directive that emits per-task TraceRecord metrics to a user-specified TSV or JSONL file, with an optional repeats option that fans out into N independent task executions. Settable from nextflow.config so existing pipelines can be benchmarked without source changes.

Signed-off-by: Claude <noreply@anthropic.com>

netlify · 2026-05-14T20:24:05Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`347f8db`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a062f62f2250000081c4767

edmundmiller requested review from bentsherman and pditommaso May 14, 2026 20:33

edmundmiller wants to merge 1 commit into master from claude/nextflow-benchmark-adr-JTtRv
