feat: self-hosted OpenAI-compatible providers (LiteLLM) for Pi by hauserkristof · Pull Request #416 · getsentry/warden

hauserkristof · 2026-06-30T11:03:01Z

Summary

Adds support for self-hosted, OpenAI-compatible LLM providers (such as a LiteLLM proxy) to Warden's Pi runtime. You can now register a named provider in warden.toml, point any model lane at it with the usual provider/model selector, and run Warden entirely against your own infrastructure — no dependency on a hosted vendor.

The provider is generic: any OpenAI-compatible endpoint works. LiteLLM is the worked example, not a hard requirement.

What's included

Config schema — [defaults.providers.<name>] blocks (base URL, API kind, optional headers, model list). Only id is required per model; Warden fills sensible defaults (name, reasoning, input, contextWindow, maxTokens, zero cost).
Runtime registration — custom providers are normalized and registered in the Pi adapter, exposed through getRuntimeProviderOptions, and threaded through every model lane (agent, auxiliary, synthesis).
Env-only credentials — API keys are resolved from the environment (apiKeyEnv → WARDEN_<NAME>_API_KEY → <NAME>_API_KEY), never from warden.toml.
Fail-fast preflight — a non-loopback provider with no resolvable key fails before any analysis, with a clear message. Loopback URLs (localhost, 127.0.0.1, ::1, bracketed [::1]) may run keyless.
Consistent across entry points — the preflight runs in the CLI, the trigger executor, scheduled workflows, and the PR workflow.
Docs — new "Self-hosted / OpenAI-compatible providers" section in config/models.mdx, plus the Pi models.json alternative.
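The env-only key lookup above can be sketched roughly as follows. This is an illustration under stated assumptions, not Warden's actual code: `resolveApiKey`, `envNameFor`, and the sanitization rule (upper-case the provider name, turn other characters into `_`) are hypothetical, and falling through to the next candidate when a named variable is unset is also an assumption.

```typescript
// Environment is passed in explicitly so callers and tests control it.
type Env = Record<string, string | undefined>;

interface ProviderAuthConfig {
  // Optional name of the env var that holds the key (never the key itself).
  apiKeyEnv?: string;
}

// Hypothetical sanitizer: "my-proxy" -> "MY_PROXY".
function envNameFor(providerName: string): string {
  return providerName.toUpperCase().replace(/[^A-Z0-9]+/g, "_");
}

// Try apiKeyEnv, then WARDEN_<NAME>_API_KEY, then <NAME>_API_KEY.
// Returns undefined when none of the candidates holds a non-empty value.
function resolveApiKey(
  providerName: string,
  config: ProviderAuthConfig,
  env: Env,
): string | undefined {
  const name = envNameFor(providerName);
  const candidates = [config.apiKeyEnv, `WARDEN_${name}_API_KEY`, `${name}_API_KEY`];
  for (const candidate of candidates) {
    if (!candidate) continue;
    const value = env[candidate];
    if (value) return value;
  }
  return undefined;
}
```

In a real run the third argument would be `process.env`; injecting it keeps the lookup testable without depending on the ambient environment.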

Behavior decisions

One model drives every lane. The auxiliary lane (structured extraction, dedup, merge, fix evaluation) and the synthesis lane now inherit the resolved global default model (defaults.agent.model → defaults.model → --model → WARDEN_MODEL) when their own model is unset. Explicit [defaults.auxiliary] / [defaults.synthesis] still win.

This is important for self-hosted setups: previously, setting only defaults.model = "litellm/..." kept just the agent lane on your proxy while extraction/dedup/synthesis silently fell back to the runtime's built-in default model — which resolves against a different provider (e.g. Gemini via Google's API). That both broke runs (auth/quota errors mid-analysis) and leaked model output carrying code context off the endpoint you deliberately chose. The inheritance fix closes that gap.
Loopback exemption. Unauthenticated local endpoints are a legitimate setup, so loopback hosts are allowed without a key while every other host requires one.
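A minimal sketch of the loopback check and the fail-fast rule, assuming the host is read from the WHATWG `URL` parser. The names `isLoopbackUrl` and `assertProviderAuthSketch` and the error wording are hypothetical; only the host list and the "remote needs a key, loopback may be keyless" rule come from this PR.

```typescript
// Hosts treated as loopback. Node's URL parser reports IPv6 hosts with
// brackets ("[::1]"); the bare "::1" form is kept as a defensive fallback.
const LOOPBACK_HOSTS = new Set(["localhost", "127.0.0.1", "::1", "[::1]"]);

function isLoopbackUrl(baseUrl: string): boolean {
  try {
    return LOOPBACK_HOSTS.has(new URL(baseUrl).hostname.toLowerCase());
  } catch {
    // An unparseable URL is never treated as loopback.
    return false;
  }
}

// Throws before any analysis starts when a non-loopback provider has no key.
function assertProviderAuthSketch(
  providerName: string,
  baseUrl: string,
  apiKey: string | undefined,
): void {
  if (apiKey || isLoopbackUrl(baseUrl)) return;
  throw new Error(
    `Provider "${providerName}" points at a non-loopback URL but no API key was found in the environment.`,
  );
}
```

With this shape, `http://localhost:4000/v1` passes without a key, while a remote host with no key is rejected up front rather than failing mid-run.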

Configuration example

[defaults]
runtime = "pi"
model = "litellm/my-model"   # provider-prefixed; drives all lanes

[defaults.providers.litellm]
baseUrl = "http://localhost:4000/v1"
api = "openai-completions"
# apiKeyEnv = "WARDEN_LITELLM_API_KEY"   # optional; overrides the default lookup

[[defaults.providers.litellm.models]]
id = "my-model"
# reasoning = true        # mark reasoning models
# maxTokens = 16384       # give reasoning models headroom

Then: export WARDEN_LITELLM_API_KEY=... and run Warden as usual.

Testing

Unit + integration coverage for provider normalization, key resolution, loopback detection, the auth preflight, provider forwarding through all lanes, and the new auxiliary/synthesis model inheritance (both resolution sites).
Custom-provider preflight tests are hermetic (no ambient env dependence).
Validated end-to-end against a live LiteLLM proxy: config parsing → key resolution → preflight → registration → request → response, across agent, auxiliary, and synthesis lanes, including a reasoning model.
Full suite green: pnpm lint, pnpm build, pnpm test, docs build.

Notes

v1 is OpenAI-completions only; other API shapes are out of scope.
Self-hosted models report $0 cost unless explicit costs are configured.

🤖 Generated with Claude Code

https://claude.ai/code/session_01E7ptHmVh79CqJev6WHnWNm

Adds the brainstormed design spec for registering OpenAI-compatible self-hosted endpoints (e.g. LiteLLM) as named Pi providers in warden.toml, covering all model lanes with env-based auth. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…tize Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

…ets ::1) The re-review premise was inverted: Node's WHATWG URL returns IPv6 hosts bracketed ([::1]), so the [::1] branch is the working one. Restore it and document the behavior; keep ::1 as a defensive fallback. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- Add providers?: ProvidersConfig to SkillRunnerOptions, AuxiliaryCallOptions, VerifyFindingsOptions, PostProcessFindingsOptions, JsonOutputRepairOptions, and FixJudgeRuntimeOptions
- Carry providers through every child-options construction site in analyze.ts and post-process.ts so verify/dedup lanes do not silently lose the field
- Attach providerOptions: getRuntimeProviderOptions(..., { providers }) to every runAuxiliary/runSynthesis call in extract.ts, dedup.ts, json-output.ts, judge.ts, and verify.ts
- Extend findSemanticDuplicates Pick type to include providers
- Add focused integration test asserting providerOptions is forwarded
- Fix judge.runtime-options.test.ts mock to include getRuntimeProviderOptions

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Inject env into verifyCustomProviderAuthForRun so the preflight tests no longer depend on the runner's environment (a CI-set LITELLM_API_KEY would otherwise false-pass the no-key case). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Address final-review nits: document the intentional process.env read in getRuntimeProviderOptions (preflight resolves against the same env), and include ::1 in the loopback auth note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ry points

Address PR #1 review findings for the LiteLLM custom-provider feature:

- schedule.ts: scheduled workflows ran assertValidPiModelSelectors but skipped the assertCustomProviderAuth preflight that the trigger executor and CLI already perform. Add it for pi-runtime triggers so a missing remote provider key fails fast instead of at run time.
- main.ts (runConfigMode): the custom-provider preflight used raw trigger.runtime, ignoring the --runtime override that the adjacent Claude auth check and the actual run already honor. Thread `options.runtime ?? trigger.runtime` so preflight matches what executes.
- judge.runtime-options.test.ts: add coverage proving providerOptions flow from evaluateFix through to runAuxiliary (previously unverified).
- outline.test.ts: add a repoPath case exercising the runStructuredSkillBuilderAgent branch; provider forwarding there was untested (only the runSynthesis branch was covered).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…anes The auxiliary lane (structured extraction, dedup, merge, fix evaluation) and the synthesis lane resolved their model only from [defaults.auxiliary] / [defaults.synthesis]. When those were unset, the lanes fell back to the runtime's built-in default model, which for the Pi runtime resolves against a different provider entirely (e.g. gemini via Google's API). For self-hosted custom providers this is a real defect, not just a docs gap: configuring `defaults.model = "litellm/..."` kept only the agent lane on the proxy while extraction/dedup/synthesis silently escaped to an unconfigured external provider. That both breaks (auth/quota errors mid-run) and, worse, leaks model output carrying code context off the self-hosted endpoint the user deliberately chose. Make the auxiliary and synthesis lanes fall back to the resolved global default model (defaults.agent.model -> defaults.model -> --model -> WARDEN_MODEL) when their own model is unset, in both resolution sites (CLI resolveCliDefault* and loader resolveSkillConfigs). Explicit auxiliary/synthesis models still win, and lanes still ignore skill/trigger-level overrides (agent lane only). A single configured model now drives every lane. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
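The fallback order described in this commit message can be sketched as below. This is a simplified model, not the code in resolveCliDefault* or resolveSkillConfigs: the `ModelDefaults` and `ModelOverrides` shapes and the helper names are assumptions made for illustration.

```typescript
interface ModelDefaults {
  model?: string;                 // defaults.model
  agent?: { model?: string };     // defaults.agent.model
  auxiliary?: { model?: string }; // [defaults.auxiliary]
  synthesis?: { model?: string }; // [defaults.synthesis]
}

interface ModelOverrides {
  cliModel?: string; // value of --model
  envModel?: string; // value of WARDEN_MODEL
}

// Return the first candidate that is set to a non-empty string.
function firstSet(...candidates: Array<string | undefined>): string | undefined {
  return candidates.find((c) => c !== undefined && c.length > 0);
}

// Global default: defaults.agent.model -> defaults.model -> --model -> WARDEN_MODEL.
function resolveGlobalModel(
  defaults: ModelDefaults,
  overrides: ModelOverrides = {},
): string | undefined {
  return firstSet(defaults.agent?.model, defaults.model, overrides.cliModel, overrides.envModel);
}

// A lane uses its own explicit model first and otherwise inherits the global default.
function resolveLaneModel(
  lane: "auxiliary" | "synthesis",
  defaults: ModelDefaults,
  overrides: ModelOverrides = {},
): string | undefined {
  return firstSet(defaults[lane]?.model, resolveGlobalModel(defaults, overrides));
}
```

Under this model, a config that sets only `defaults.model = "litellm/my-model"` resolves both lanes to `litellm/my-model` instead of leaving them on a runtime built-in.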

…lane resolveWorkflowAuxiliaryOptions read only [defaults.auxiliary].model and never fell back to the global default model, unlike resolveSkillConfigs and the CLI resolvers. In the GitHub Action PR workflow, dedup, consolidation, and fix evaluation could still hit the runtime's built-in default model on another provider while `providers` pointed at a custom (e.g. self-hosted) endpoint - the same escape the lane-inheritance fix closed for the other entry points. Add the global-model fallback (agent.model -> model) after the explicit auxiliary models, keeping the existing base-first enforced-baseline precedence. Export the resolver and unit-test the inheritance and precedence. Reported by Cursor Bugbot on PR getsentry#416. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…luation evaluateFix called getRuntime(runtimeOptions.runtime) - which defaults an omitted runtime to 'pi' - but resolved provider options with getRuntimeProviderOptions(runtimeOptions.runtime ?? 'claude', ...). On the omitted-runtime path the two disagreed: the call ran on Pi while provider options were built for Claude, so custom providers were never registered and fix evaluation silently escaped to a runtime default on another provider. Resolve one effective runtime (runtimeOptions.runtime ?? 'pi') and pass it to both getRuntime and getRuntimeProviderOptions so they can never diverge. Update the null-options test and add a test for the omitted-runtime forwarding path. Reported by Cursor Bugbot on PR getsentry#416. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
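The shape of this fix, resolving the runtime once and reusing it, might look like the sketch below. `buildProviderOptionsSketch` is a hypothetical stand-in for getRuntimeProviderOptions (whose real signature is not reproduced here), and the option types are simplified assumptions.

```typescript
type RuntimeName = "pi" | "claude";

interface FixJudgeOptionsSketch {
  runtime?: RuntimeName;
  providers?: Record<string, unknown>;
}

// Single source of truth for the default: an omitted runtime means "pi".
function resolveEffectiveRuntime(options: FixJudgeOptionsSketch): RuntimeName {
  return options.runtime ?? "pi";
}

// Stand-in: only the Pi runtime receives custom provider registration.
function buildProviderOptionsSketch(
  runtime: RuntimeName,
  providers: Record<string, unknown> | undefined,
): { providers: Record<string, unknown> } | undefined {
  return runtime === "pi" && providers ? { providers } : undefined;
}

// Both the runtime choice and the provider options derive from the same value,
// so they cannot disagree on the omitted-runtime path.
function planFixEvaluation(options: FixJudgeOptionsSketch) {
  const runtime = resolveEffectiveRuntime(options);
  return { runtime, providerOptions: buildProviderOptionsSketch(runtime, options.providers) };
}
```

The design point is that the default lives in exactly one place; two separate `?? 'pi'` and `?? 'claude'` fallbacks are what allowed the call site and the provider options to diverge.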

inheritRepoLayerDefaults carried the org base runtime and verification defaults into the repo layer but not defaults.providers. A repo config that only added skills therefore lost the custom providers defined by the org base config, so its resolved triggers ran without them. Inherit defaults.providers as an execution-environment default, alongside runtime. Per-skill policy defaults (model, failOn, ignorePaths, ...) still do not cross layers, matching the existing selective-inheritance contract. Reported by Cursor Bugbot on PR getsentry#416. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The generated-skill build/improve command resolved its model lanes and skipped the provider preflight differently from every other entry point, so a config that points only `defaults.model` at a custom (e.g. self-hosted) provider could silently escape to a runtime default on another provider, and a keyless remote provider failed mid-build instead of failing fast.

- Synthesis model now falls back to the global default chain (synthesis -> auxiliary -> agent.model -> model -> --model -> WARDEN_MODEL) via a shared resolveDefaultModel helper, matching resolveCliDefaultSynthesisModel.
- Repair model gains the same auxiliary inheritance chain instead of reading only defaults.auxiliary.model.
- Add the assertCustomProviderAuth preflight (fail-fast with reporter.error) before synthesis, matching the CLI, executor, and workflow entry points.

Export the resolvers and unit-test the inheritance/precedence; add an integration test that build fails fast on a keyless remote provider.

Reported by Cursor Bugbot on PR getsentry#416.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

In runStructuredSkillBuilderAgent, when structured output fails validation and the primary repairStructuredSkillBuilderOutput path also fails, the secondary parseJsonFromOutput repair call omitted `providers`. The primary path forwards them, so on Pi with a self-hosted model the fallback repair ran without the registered custom providers and could fail or hit the wrong backend. Pass providers through to the fallback repair options (JsonOutputRepairOptions already supports and forwards them). Add a test covering the fallback path. Reported by Cursor Bugbot on PR getsentry#416. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit e31e6fe.

cursor · 2026-06-30T16:49:10Z

+  if (base?.providers !== undefined && inherited.providers === undefined) {
+    inherited.providers = base.providers;
+  }
+


Layered repo triggers miss model

High Severity

Org-level defaults.providers and defaults.model for a self-hosted lane are handled inconsistently across layers. Repo-only triggers inherit custom providers from the base config but not the base default model, so trigger execution can run auxiliary and synthesis calls without the org’s litellm/… model while providers are present. Workflow-scoped dedup and consolidate still resolve the base model via resolveWorkflowAuxiliaryOptions, so analysis and posting can disagree on which model and provider lane runs.

Additional Locations (1)

packages/warden/src/action/workflow/pr-workflow.ts#L177-L204

cursor · 2026-06-30T16:49:10Z

    model: options.model,
    maxTokens: 512,
    maxRetries: options.maxRetries,
+    providerOptions: getRuntimeProviderOptions(options.runtime ?? 'claude', { providers: options.providers }),


Dedup defaults drop Pi providers

Medium Severity

Semantic dedup and batch consolidate now forward providerOptions using getRuntimeProviderOptions, but still default an omitted runtime to claude for both getRuntime and the provider-options lookup. Elsewhere in this change set the effective default runtime is pi, and getRuntimeProviderOptions only builds custom provider registration for Pi. Callers that pass providers without an explicit runtime therefore hit the Claude adapter and silently omit custom provider registration.

Additional Locations (1)

packages/warden/src/output/dedup.ts#L919-L929

hauserkristof and others added 17 commits June 26, 2026 23:31

docs: implementation plan for custom Pi providers (LiteLLM)

5bc7bbc

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(config): add custom provider schema for self-hosted LLMs

ff7edb0

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(runtime): normalize custom providers and resolve keys from env

4e24bbc

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(runtime): drop unreachable IPv6 branch, dedupe provider-name sani…

4263424

…tize Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

feat(runtime): expose custom providers via getRuntimeProviderOptions

43c4c16

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(runtime): register custom providers in the pi adapter

fc1c463

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(config): propagate custom providers to all runner entry points

c435776

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(skill-builder): route build/improve through custom providers

e105556

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(runtime): fail fast when a remote custom provider has no key

a1b5583

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs: document self-hosted OpenAI-compatible providers (LiteLLM)

4a7dbc8

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

cursor Bot reviewed Jun 30, 2026

Comment thread packages/warden/src/action/fix-evaluation/judge.ts Outdated

Comment thread packages/warden/src/action/workflow/pr-workflow.ts

Comment thread packages/warden/src/config/loader.ts

hauserkristof and others added 3 commits June 30, 2026 16:34

cursor Bot reviewed Jun 30, 2026

Comment thread packages/warden/src/cli/commands/build.ts

Comment thread packages/warden/src/cli/commands/build.ts

Comment thread packages/warden/src/cli/commands/build.ts

cursor Bot reviewed Jun 30, 2026

Comment thread packages/warden/src/skill-builder/agentic.ts

cursor Bot reviewed Jun 30, 2026

hauserkristof wants to merge 22 commits into getsentry:main from hauserkristof:feat/custom-pi-provider-litellm
