Merged
5 changes: 2 additions & 3 deletions .github/workflows/claude-orchestrator.yml
@@ -219,9 +219,8 @@ jobs:
with:
model_id: ${{ inputs.model_id }}
bedrock_role_arn: ${{ inputs.bedrock_role_arn }}
-# aws_region defaults to us-east-1; the codex executor remaps that to us-east-2,
-# where GPT-5.5/5.4 are served (the mantle endpoint exists in us-east-1 but the
-# models do not yet). Consumers only set model_id.
+# aws_region defaults to us-east-1, where GPT-5.5/5.4 are served (also us-east-2;
+# GPT-5.4 also us-west-2). Consumers only set model_id.
aws_region: ${{ inputs.aws_region }}
prompt: ${{ inputs.prompt }}
sticky_namespace: ${{ inputs.sticky_namespace }}
61 changes: 31 additions & 30 deletions .github/workflows/codex-executor.yml
@@ -2,10 +2,17 @@
# Codex Executor Workflow (Reusable)
#
# PURPOSE: Reviews PRs using OpenAI GPT/Codex models (GPT-5.5, GPT-5.4) served by the
-# AWS **bedrock-mantle** endpoint — the OpenAI Responses API at
-# https://bedrock-mantle.{region}.api.aws/v1. These models are NOT on bedrock-runtime:
-# there is no InvokeModel/Converse, so the generic Bedrock executor cannot reach them.
-# Maintains the same auto-updating sticky comment as the other executors.
+# AWS **bedrock-mantle** endpoint via the OpenAI Responses API. These models are NOT on
+# bedrock-runtime: there is no InvokeModel/Converse, so the generic Bedrock executor
+# cannot reach them. Maintains the same auto-updating sticky comment as the other executors.
+#
+# BASE PATH (critical): bedrock-mantle serves the two OpenAI families on DIFFERENT
+# OpenAI-compatible base paths on the same host. The frontier GPT-5.x / Codex models are
+# served under **/openai/v1** (per the AWS launch docs); the open-weight gpt-oss-* models
+# are served under **/v1**. They are mutually exclusive — verified live 2026-06-11:
+# gpt-5.5/5.4 reject /v1 ("model does not support the '/v1/responses' API") and gpt-oss-120b
+# rejects /openai/v1. So we pick the path from the model id (see mantle_review.py). Sending
+# every model to /v1 (the pre-fix behavior) is why GPT-5.5/5.4 appeared "unavailable" (#34).
#
# AUTH: OIDC -> assumed role -> a SHORT-TERM Bedrock bearer token minted from that session
# (aws-bedrock-token-generator `provide_token()`), passed to the OpenAI SDK. The OpenAI SDK
@@ -20,11 +27,9 @@
# the whole response and looks like a 60-100s hang. We stream the Responses API and accumulate
# response.output_text.delta events. max_output_tokens does NOT cap reasoning tokens.
#
-# REGION: the bedrock-mantle ENDPOINT exists in many regions including us-east-1, BUT the
-# GPT-5.5/5.4 MODELS are currently served only in us-east-2 — verified live via the Models
-# API (us-east-1 lists gpt-oss but no gpt-5*; us-east-2 lists openai.gpt-5.5 / openai.gpt-5.4).
-# So this executor remaps the us-east-1 default to us-east-2 where the models live. GPT-5.4 is
-# also offered in us-west-2. (Re-check the Models API if AWS expands GPT-5.x to us-east-1.)
+# REGION: the requested aws_region is used as-is. GPT-5.5/5.4 are served in us-east-1 and
+# us-east-2 (GPT-5.4 also us-west-2); gpt-oss is in all of them — so the orchestrator's
+# us-east-1 default works for every model. Verified live 2026-06-11.
#
# DATA RETENTION: the Responses API defaults store=true, which retains input+output for 30
# days in-region for previous_response_id chaining. Code review is single-shot, so we send
@@ -51,7 +56,7 @@ on:
required: true
type: string
aws_region:
-description: 'AWS region for the mantle endpoint. The us-east-1 default is remapped to us-east-2, where GPT-5.5/5.4 are served (the endpoint exists in us-east-1 but the models do not yet).'
+description: 'AWS region for the mantle endpoint, used as-is. GPT-5.5/5.4 are served in us-east-1 and us-east-2 (GPT-5.4 also us-west-2); gpt-oss in all.'
required: false
type: string
default: 'us-east-1'
@@ -83,6 +88,11 @@ on:
required: false
type: string
default: 'medium'
+mantle_api_path:
+description: 'Override the mantle OpenAI-compat base path. Empty (default) auto-selects: /openai/v1 for GPT-5.x/Codex, /v1 for gpt-oss-*. Set only if AWS changes the routing.'
+required: false
+type: string
+default: ''
timeout_minutes:
description: 'Job timeout in minutes. Reasoning models stream slowly; default is generous.'
required: false
@@ -110,23 +120,6 @@ jobs:
# Default uses the model id so different models naturally get different stickies.
STICKY_MARKER: ${{ format('<!-- dotcms-ai-review:v3:{0} -->', inputs.sticky_namespace != '' && inputs.sticky_namespace || inputs.model_id) }}
steps:
-- name: Resolve mantle region
-id: region
-env:
-REQUESTED_REGION: ${{ inputs.aws_region }}
-run: |
-set -euo pipefail
-# The mantle endpoint exists in us-east-1, but GPT-5.5/5.4 are served only in
-# us-east-2 (us-east-1 lists gpt-oss but no gpt-5*). Treat the us-east-1 default as
-# "send to where the models live" so consumers only set model_id. An explicit
-# us-west-2 (valid for GPT-5.4) is honored as-is.
-REGION="${REQUESTED_REGION}"
-if [ -z "${REGION}" ] || [ "${REGION}" = "us-east-1" ]; then
-REGION="us-east-2"
-fi
-echo "Effective mantle region: ${REGION}"
-echo "region=${REGION}" >> "$GITHUB_OUTPUT"
-
- uses: actions/checkout@v4
with:
fetch-depth: 1
@@ -135,7 +128,7 @@ jobs:
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ inputs.bedrock_role_arn }}
-aws-region: ${{ steps.region.outputs.region }}
+aws-region: ${{ inputs.aws_region }}

- name: Set up uv
uses: astral-sh/setup-uv@v6
@@ -220,7 +213,14 @@ jobs:
# mantle region by configure-aws-credentials, so the token is signed for that region.
token = provide_token()

-client = OpenAI(base_url=f"https://bedrock-mantle.{region}.api.aws/v1", api_key=token)
+# bedrock-mantle serves the two OpenAI families on different OpenAI-compatible base
+# paths on the same host: frontier GPT-5.x / Codex live under /openai/v1 (per the AWS
+# launch docs), open-weight gpt-oss-* under /v1. They reject each other's path, so
+# pick by model id. Verified live 2026-06-11 (#34). MANTLE_API_PATH overrides if AWS
+# ever unifies them. (Path includes the OpenAI-compat segment; host is mantle.)
+api_path = os.environ.get("MANTLE_API_PATH") or ("/v1" if "gpt-oss" in model else "/openai/v1")
+client = OpenAI(base_url=f"https://bedrock-mantle.{region}.api.aws{api_path}", api_key=token)
+print(f"mantle base path: {api_path} (model: {model})", file=sys.stderr)

text_parts, usage = [], None
try:
@@ -348,7 +348,8 @@ jobs:
- name: Invoke bedrock-mantle (OpenAI Responses API, streaming)
id: invoke
env:
-MANTLE_REGION: ${{ steps.region.outputs.region }}
+MANTLE_REGION: ${{ inputs.aws_region }}
+MANTLE_API_PATH: ${{ inputs.mantle_api_path }}
run: |
set -euo pipefail
# Dependencies (openai SDK + aws-bedrock-token-generator) are declared inline in the
5 changes: 3 additions & 2 deletions ARCHITECTURE.md
@@ -151,11 +151,12 @@ Uses the Bedrock Converse API, which is model-family-agnostic. Maintains its own

#### 4. `codex-executor.yml` (OpenAI GPT/Codex via bedrock-mantle)

-For `openai.*` models (GPT-5.5, GPT-5.4), which are **not** on bedrock-runtime — there is no `InvokeModel`/`Converse`. They are served only by the separate **bedrock-mantle** endpoint exposing the OpenAI Responses API (`https://bedrock-mantle.{region}.api.aws/v1/responses`). The executor:
+For `openai.*` models (GPT-5.5, GPT-5.4), which are **not** on bedrock-runtime — there is no `InvokeModel`/`Converse`. They are served only by the separate **bedrock-mantle** endpoint exposing the OpenAI Responses API. The frontier GPT-5.x/Codex models live under `https://bedrock-mantle.{region}.api.aws/openai/v1/responses`; the open-weight `gpt-oss-*` models live under `…/v1/responses`. The executor:

- Calls mantle with the **OpenAI SDK**, authenticated by a **short-term Bedrock bearer token** minted in-process from the assumed-role session via `aws-bedrock-token-generator` (`provide_token()`). The SDK can't consume SigV4 directly, but a short-term key keeps the OIDC-only posture: it's derived from the current STS credentials (no long-lived secret), inherits the role's permissions, expires with the role session (≤1h here, ≤12h cap), is **not a stored resource** (nothing to delete), and is never written to env/disk/logs. No marketplace subscription; no long-term API key.
- **Streams** Server-Sent Events and accumulates `response.output_text.delta` chunks. Streaming is mandatory: GPT-5.x reasons before emitting, so a non-streaming call buffers and looks like a 60–100s hang.
-- Remaps the orchestrator's `us-east-1` default to **us-east-2**, where GPT-5.5/5.4 are served. The mantle *endpoint* exists in us-east-1, but the *models* are not there yet (verified via the Models API: us-east-1 lists gpt-oss but no gpt-5*). GPT-5.4 also accepts an explicit us-west-2.
+- **Selects the OpenAI-compat base path by model id**: `/openai/v1` for frontier GPT-5.x/Codex, `/v1` for `gpt-oss-*`. The two families reject each other's path (`400 validation_error: "does not support the '…/responses' API"`) — verified live 2026-06-11 (#34). Sending all models to `/v1` (the original behavior) is why GPT-5.5/5.4 looked unavailable. A `mantle_api_path` input overrides the auto-selection if AWS unifies the routing.
+- Uses the requested region as-is. GPT-5.5/5.4 are served in us-east-1 and us-east-2 (GPT-5.4 also us-west-2) and gpt-oss in all, so the orchestrator's us-east-1 default works for every model (verified live 2026-06-11 — at v3.1.0's authoring GPT-5.x were us-east-2-only, which is why an earlier revision remapped the region; that remap has been removed).
- Sends `store: false` on each request for **zero data retention** — the Responses API otherwise defaults `store: true`, retaining input+output for 30 days in-region for `previous_response_id` chaining, which single-shot review doesn't need.
- Reuses the same `/tmp` sticky-comment helper and `sticky_namespace` input as the generic executor. Exposes `reasoning_effort` (default `medium`). Note `max_output_tokens` caps only the visible answer, **not** reasoning tokens.
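
The base-path selection described in the list above can be sketched as a small pure function. This is a minimal illustration that mirrors the one-line expression in the executor's inline script; the function name `mantle_base_url` and the `openai.gpt-oss-120b` id form are assumptions made for the example, not names taken from the workflow.

```python
def mantle_base_url(model: str, region: str, override: str = "") -> str:
    """Build the bedrock-mantle base URL for a model id (illustrative helper)."""
    # A non-empty override (the mantle_api_path input) wins. Otherwise any id
    # containing "gpt-oss" goes to /v1 and every other model goes to /openai/v1.
    api_path = override or ("/v1" if "gpt-oss" in model else "/openai/v1")
    return f"https://bedrock-mantle.{region}.api.aws{api_path}"


if __name__ == "__main__":
    print(mantle_base_url("openai.gpt-5.5", "us-east-1"))
    # The gpt-oss id form below is assumed for illustration.
    print(mantle_base_url("openai.gpt-oss-120b", "us-east-1"))
    print(mantle_base_url("openai.gpt-5.4", "us-west-2", override="/v1"))
```

Because an empty override is falsy, passing the workflow input through unchanged leaves auto-selection in effect, which is why the input can default to `''`.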

4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -15,7 +15,7 @@ The repository implements a reusable workflow architecture with model-aware rout
- **Claude Orchestrator** (`.github/workflows/claude-orchestrator.yml`): Lightweight wrapper that handles @claude mention detection AND routes to the appropriate executor based on `model_id`. Consumer repositories call this with `trigger_mode: interactive` or `trigger_mode: automatic`. Exactly one executor runs per call.
- **Claude Executor** (`.github/workflows/claude-executor.yml`): Execution engine for Anthropic models — runs `anthropics/claude-code-action@v1` either against the direct Anthropic API (`provider: anthropic-api`, default) or via AWS Bedrock (`provider: anthropic-bedrock`, OIDC + `use_bedrock=true`).
- **Bedrock Generic Executor** (`.github/workflows/bedrock-generic-executor.yml`): Execution engine for **any non-Anthropic Bedrock model** (Amazon Nova, Meta Llama, Mistral, Cohere, AI21). Uses the Bedrock Converse API and maintains its own sticky comment via an inlined helper (set up to `/tmp` at job start, so no cross-repo path dependency).
-- **Codex Executor** (`.github/workflows/codex-executor.yml`): Execution engine for **OpenAI GPT/Codex models** (`openai.gpt-5.5`, `openai.gpt-5.4`). These are served only by the separate **bedrock-mantle** endpoint (OpenAI Responses API), not bedrock-runtime — so it calls mantle with the **OpenAI SDK** authenticated by a **short-term Bedrock bearer token** minted in-process from the OIDC-assumed-role session (`aws-bedrock-token-generator`), and streams `response.output_text.delta` events. The token is OIDC-derived (no long-lived secret, nothing to clean up, ≤1h via the role session) and never written to env/disk/logs; IAM grants `bedrock-mantle:CallWithBearerToken` scoped to `BearerTokenType=SHORT_TERM`. Streaming is mandatory (GPT-5.x reasons before emitting). Remaps the `us-east-1` default to `us-east-2`, where GPT-5.5/5.4 are served (the mantle endpoint exists in us-east-1 but the models are not there yet — verified via the Models API). Sends `store: false` for zero data retention. Reuses the same `/tmp` sticky-comment helper. See dotCMS/Infrastructure-as-code#7836.
+- **Codex Executor** (`.github/workflows/codex-executor.yml`): Execution engine for **OpenAI GPT/Codex models** (`openai.gpt-5.5`, `openai.gpt-5.4`). These are served only by the separate **bedrock-mantle** endpoint (OpenAI Responses API), not bedrock-runtime — so it calls mantle with the **OpenAI SDK** authenticated by a **short-term Bedrock bearer token** minted in-process from the OIDC-assumed-role session (`aws-bedrock-token-generator`), and streams `response.output_text.delta` events. The token is OIDC-derived (no long-lived secret, nothing to clean up, ≤1h via the role session) and never written to env/disk/logs; IAM grants `bedrock-mantle:CallWithBearerToken` scoped to `BearerTokenType=SHORT_TERM`. Streaming is mandatory (GPT-5.x reasons before emitting). **Base path is model-dependent:** frontier GPT-5.x/Codex are served under `/openai/v1`, open-weight `gpt-oss-*` under `/v1` — the executor picks by model id (they reject each other's path; verified live 2026-06-11, #34). Uses the requested region as-is (GPT-5.5/5.4 are served in us-east-1 and us-east-2, GPT-5.4 also us-west-2). Sends `store: false` for zero data retention. Reuses the same `/tmp` sticky-comment helper. See dotCMS/Infrastructure-as-code#7836.
- **Deployment Guard** (`.github/workflows/deployment-guard.yml`): Reusable workflow for validating deployment changes with configurable rules. Features organization-based bypass for trusted members, file allowlist validation, image-only change detection, and comprehensive image validation (format, repository, version pattern, registry existence, anti-downgrade logic).
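
For the Codex Executor entry above, the streaming accumulation can be sketched without any network call. This is a rough illustration, not the executor's script: `Event` is a hypothetical stand-in for the SDK's streamed event objects, and carrying `usage` on a flat `response.completed` event is an assumption made to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Event:
    """Hypothetical stand-in for one streamed Responses API event."""
    type: str
    delta: str = ""
    usage: Optional[dict] = None  # assumed shape, flattened for illustration


def accumulate(events: Iterable[Event]) -> Tuple[str, Optional[dict]]:
    """Join output-text deltas in arrival order and keep the last usage seen."""
    text_parts, usage = [], None
    for event in events:
        if event.type == "response.output_text.delta":
            text_parts.append(event.delta)
        elif event.type == "response.completed":
            usage = event.usage
    return "".join(text_parts), usage


if __name__ == "__main__":
    stream = [
        Event("response.output_text.delta", delta="Looks "),
        Event("response.output_text.delta", delta="good."),
        Event("response.completed", usage={"output_tokens": 2}),
    ]
    print(accumulate(stream))
```

Event types other than the two handled here are ignored, so the visible review text is built only from the delta events.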

### Multi-model Routing (v3)
@@ -27,7 +27,7 @@ The orchestrator picks the executor by inspecting `model_id`:
| _(empty / unset)_ | `claude-executor` (`anthropic-api`)| Backward-compat default; requires `ANTHROPIC_API_KEY` secret |
| `*.anthropic.*` (e.g. `global.anthropic.claude-sonnet-4-6`) | `claude-executor` (`anthropic-bedrock`) | Requires `bedrock_role_arn` input |
| `anthropic.*` (bare) | `claude-executor` (`anthropic-bedrock`) | Requires `bedrock_role_arn` input |
-| `openai.*` (e.g. `openai.gpt-5.5`, `openai.gpt-5.4`) | `codex-executor` | Requires `bedrock_role_arn`; mantle path (us-east-2) |
+| `openai.*` (e.g. `openai.gpt-5.5`, `openai.gpt-5.4`) | `codex-executor` | Requires `bedrock_role_arn`; mantle `/openai/v1` (gpt-oss → `/v1`) |
| Anything else (Nova, Llama, Mistral, …) | `bedrock-generic-executor` | Requires `bedrock_role_arn` input |

The matches for the Anthropic and OpenAI families are anchored: `^([a-z]+\.)?anthropic\.` and `^([a-z]+\.)?openai\.` — so a model ID that merely contains the substring `anthropic.`/`openai.` (e.g. `us.not-anthropic.foo`) is **not** misrouted. `openai.*` is checked before the generic fallback.
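
The anchored matching can be checked with a short Python sketch. The two regular expressions are the ones quoted above; the function name `pick_executor`, the returned labels, and the `amazon.nova-pro-v1:0` sample id are illustrative assumptions, and the orchestrator itself does this routing inside the workflow rather than in Python.

```python
import re

# Patterns as quoted in the routing description.
ANTHROPIC_RE = re.compile(r"^([a-z]+\.)?anthropic\.")
OPENAI_RE = re.compile(r"^([a-z]+\.)?openai\.")


def pick_executor(model_id: str) -> str:
    """Return an illustrative label for the executor a model id routes to."""
    if not model_id:
        return "claude-executor (anthropic-api)"
    if ANTHROPIC_RE.match(model_id):
        return "claude-executor (anthropic-bedrock)"
    # openai.* is checked before the generic fallback.
    if OPENAI_RE.match(model_id):
        return "codex-executor"
    return "bedrock-generic-executor"


if __name__ == "__main__":
    for mid in ["", "global.anthropic.claude-sonnet-4-6", "openai.gpt-5.5",
                "us.not-anthropic.foo", "amazon.nova-pro-v1:0"]:
        print(repr(mid), "->", pick_executor(mid))
```

The optional group accepts at most one lowercase prefix segment, so `us.not-anthropic.foo` fails both patterns and falls through to the generic executor.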