ci(sdk-regression): full 33-SDK matrix via RUN_REGRESSION comment / dispatch with per-SDK ref overrides (PER-9772) #2322

Open
pranavz28 wants to merge 18 commits into master from PER-9772_sdk-regression-pr-gate

Conversation

@pranavz28 pranavz28 commented Jun 26, 2026

Contributor

What (PER-9772)

Make the SDK regression a usable pre-merge gate for percy/cli changes:

  • Triggers — the regression now runs only via:
    • a RUN_REGRESSION comment on a PR (write/admin authors; tests that PR's CLI branch), or
    • workflow_dispatch (defaults to CLI master).
    • The earlier run-sdk-regression label trigger from this PR's first iteration was dropped in review of the final design.
  • Full matrix — expanded from 19 to 33 SDKs: added the injection-capable SDKs that were missing from the fan-out (detox, playwright-python/java/dotnet, appium-dotnet, styleguidist) and the App Percy + remaining SDKs (appium-python/java/wd/ruby, maestro-web/app, react-native-app, tosca-dotnet, uipath, xcui-swift). percy-robotframework (archived) and percy-nightmare (downstream hangs) are excluded.
  • Per-SDK ref overrides on dispatch — new sdk_refs input, comma-separated repo@branch. Example: branch=master, sdk_refs=percy-cypress@my-fix runs the whole matrix against cli master with percy-cypress's workflow taken from my-fix; unlisted SDKs keep their matrix default ref. Overridden refs are regex-validated before flowing into the downstream dispatch. (Comment path always uses matrix defaults — issue_comment carries no inputs.)
  • Dispatch plumbing fixes: job-id lookup paginated to cover the >30-job matrix (and made non-gating); percy-tosca-dotnet/percy-uipath dispatch ci.yml (they have no test.yml); percy-storybook dispatches test-storybook-v10.yml; percy-react-native-app dispatches storybook-rn-ci.yml.
  • Removed the App Percy + POA Buildkite job.

Rollout fixes that made the matrix green

Getting to green surfaced 9 repos where percy-bot (the WORKFLOW_DISPATCH_ACTIONS_TOKEN identity) lacked access (now granted), plus 7 SDK-side bugs, all fixed and merged:
  • percy-selenium-ruby#39 + percy-appium-js#589: CLI-from-git setup used npx percy without yarn global bin on PATH → pulled the unrelated public percy package.
  • percy-ember#1171: canvas test asserted attribute order.
  • percy-protractor#734: local _iframe_shim fallbacks drifted from the canonical sdk-utils contract of #2319.
  • percy-selenium-python#247: pre-existing coverage debt.
  • percy-maestro-app#17: no workflow_dispatch inputs declared → 422.
  • percy-styleguidist#26: ESM test loader mislabeled yarn-linked CJS dists.

Verification

  • Full dispatch run with the 33-SDK matrix: 28647243686, all SDKs green (sole red row was percy-selenium-python before its fix PR percy-selenium-python#247 merged).
  • sdk_refs parsing unit-tested (empty input → defaults; per-repo exact match, no prefix cross-match; last-wins duplicates; rel/1.2.x-style refs allowed; shell metacharacters rejected).
  • Non-default-ref dispatch mechanism verified end-to-end: percy-selenium-ruby test.yml dispatched at a branch ref with the CLI branch input (run 28662964285).
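
The sdk_refs behaviors listed above can be sketched as follows. This is a minimal illustration, not the workflow's actual parser: the function names `parse_sdk_refs` and `resolve_ref` and both regexes are assumptions made for the example, since the real validation pattern is not shown in this PR.

```python
import re
from typing import Dict

# Hypothetical allow-lists; the workflow's real regex may differ.
REPO_RE = re.compile(r"[A-Za-z0-9._-]+")
REF_RE = re.compile(r"[A-Za-z0-9._/-]+")


def parse_sdk_refs(raw: str) -> Dict[str, str]:
    """Parse a comma-separated 'repo@branch' list into {repo: ref}."""
    overrides: Dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # empty input (or a stray comma) adds no override
        repo, sep, ref = entry.partition("@")
        if not sep or not REPO_RE.fullmatch(repo) or not REF_RE.fullmatch(ref):
            raise ValueError(f"invalid sdk_refs entry: {entry!r}")
        overrides[repo] = ref  # a later duplicate replaces an earlier one
    return overrides


def resolve_ref(repo: str, default_ref: str, overrides: Dict[str, str]) -> str:
    """Exact-match lookup by repo name; unlisted SDKs keep their default ref."""
    return overrides.get(repo, default_ref)
```

Because the lookup is a dictionary `get` on the full repo name, an override for `percy-cypress` cannot leak onto a repo whose name merely starts with `percy-cypress`.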

🤖 Generated with Claude Code

Make SDK regression runnable as an automatic PR check, not only via a
manual `RUN_REGRESSION` comment. Adds a `pull_request` trigger gated by
the `run-sdk-regression` label; resolves the PR head ref/sha from either
event; keeps the comment path and its write/admin permission guard intact.

Untrusted head.ref is passed via env (not interpolated into the shell) and
is still validated by the existing regex-match step before any downstream
workflow is triggered.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
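
For illustration, the env-passing pattern from this commit can be sketched as below. It assumes a hypothetical allow-list regex (`SAFE_REF`) and a hypothetical helper name (`echo_ref_via_env`); the workflow's actual regex-match step is not reproduced here. The script text handed to the shell is a constant, and the untrusted value only reaches it through an environment variable.

```python
import os
import re
import subprocess

# Hypothetical allow-list; stands in for the workflow's regex-match step.
SAFE_REF = re.compile(r"[A-Za-z0-9._/-]+")


def echo_ref_via_env(head_ref: str) -> str:
    """Validate an untrusted ref, then hand it to the shell via env only."""
    if not SAFE_REF.fullmatch(head_ref):
        raise ValueError(f"refusing unsafe ref: {head_ref!r}")
    env = dict(os.environ, HEAD_REF=head_ref)
    # The command string never contains head_ref itself, so nothing in the
    # ref can be parsed as shell syntax; the shell just expands "$HEAD_REF".
    result = subprocess.run(
        ["sh", "-c", 'printf "%s" "$HEAD_REF"'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The validation and the env hand-off are independent defenses: the regex rejects suspicious refs outright, and the env hand-off keeps even an accepted ref out of the script text.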
@pranavz28

Contributor Author

RUN_regression

@pranavz28

Contributor Author

RUN_REGRESSION

pranavz28 and others added 10 commits June 29, 2026 01:08
The same regression trigger (RUN_REGRESSION comment or run-sdk-regression
label) now also fires the App Percy + POA suites, which run on Buildkite
(real BrowserStack devices/browsers). A new trigger-app-poa job
repository_dispatches to percy/percy-automation, whose workflow creates the
Buildkite builds against this CLI branch. percy-automation remains the single
owner of App/POA-on-Buildkite; this is just the trigger.

Internal-only guard (write/admin or label) and env-based, regex-validated
branch handling mirror the web job. Requires a PERCY_AUTOMATION_DISPATCH_TOKEN
secret.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the percy-automation repository_dispatch hop with a direct Buildkite
REST call: the app-poa-regression job creates builds on the app-percy and poa
SDK regression suites (CLI built from this branch), polls them to completion,
and upserts a per-SDK pass/fail table comment on the PR.

- Direct Buildkite trigger (BUILDKITE_API_TOKEN in this repo) — no extra repo hop.
- Waits for the builds (bounded by MAX_WAIT_MIN), then posts/edits a marker
  comment with each suite's per-job result + build links; fails the job if any
  job failed/canceled.
- Internal-only guard + env-based, regex-validated branch handling unchanged.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
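
A rough sketch of the bounded wait described in this commit, under stated assumptions: `fetch_state` is a hypothetical callable standing in for the Buildkite REST lookup, the set of terminal state names is assumed, and `max_wait_min` mirrors the MAX_WAIT_MIN bound. No real API call is made here.

```python
import time
from typing import Callable

# Assumed terminal states for this sketch; check the Buildkite API docs
# for the full list of build states.
TERMINAL_STATES = {"passed", "failed", "canceled"}


def wait_for_build(
    fetch_state: Callable[[], str],
    max_wait_min: float,
    poll_interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the build reaches a terminal state or the deadline passes."""
    deadline = clock() + max_wait_min * 60
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            return "timed_out"  # sentinel used by this sketch only
        sleep(poll_interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; a caller would treat `"timed_out"` the same as a failure when deciding the job result.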
Grounded against real builds of app-percy-sdk-regression-suite: the matrix
jobs are named per SDK+device (e.g. 'Python-Android [...]'), but the build
also has the bootstrap upload step ('App-Percy-SDK-tests'/'POA-SDK-tests')
and an unnamed wait job. Exclude both from the per-SDK pass/fail table and the
failure check.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
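
The exclusion rule above might look like the following sketch. The job dictionary shape (`name`, `state` keys), the helper names, and the failing-state names are assumptions for illustration; only the two bootstrap step names come from the commit message.

```python
from typing import Dict, List, Optional

# Bootstrap upload steps named in the commit message.
BOOTSTRAP_STEPS = {"App-Percy-SDK-tests", "POA-SDK-tests"}
# Assumed names for the states that should fail the job.
FAILING_STATES = {"failed", "canceled"}

Job = Dict[str, Optional[str]]


def sdk_jobs(jobs: List[Job]) -> List[Job]:
    """Keep per-SDK jobs only: drop unnamed (wait) jobs and bootstrap steps."""
    return [j for j in jobs if j.get("name") and j["name"] not in BOOTSTRAP_STEPS]


def render_table(jobs: List[Job]) -> str:
    """Build the per-SDK pass/fail table as Markdown rows."""
    rows = ["| Job | Result |", "| --- | --- |"]
    rows += [f"| {j['name']} | {j['state']} |" for j in sdk_jobs(jobs)]
    return "\n".join(rows)


def any_failed(jobs: List[Job]) -> bool:
    """True if any per-SDK job failed or was canceled."""
    return any(j["state"] in FAILING_STATES for j in sdk_jobs(jobs))
```

Filtering once in `sdk_jobs` and reusing it for both the table and the failure check keeps the two views consistent, so a failed bootstrap or wait job cannot fail the check while being invisible in the table.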
detox, playwright-python, robotframework, playwright-java, playwright-dotnet
all support CLI-branch injection in their test.yml but were missing from the
matrix, so a CLI change silently skipped them. Added as @main (their default
branch) since the split default ref is master.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…emand

Lets the SDK regression matrix be triggered manually (and on the PR's own
branch) against a chosen CLI branch, without a comment/label. The Buildkite
App/POA job stays comment/label-only, so a dispatch tests the web fan-out only.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A regression fan-out must report every SDK's result; with default fail-fast
the first SDK failure cancels all other matrix jobs, hiding the rest.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
storybook has no test.yml — it uses test-storybook-vN.yml — so the fan-out
silently failed to trigger it. Dispatch test-storybook-v10.yml for storybook.

Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both now support CLI-branch injection (appium-dotnet#403, styleguidist#25).
Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds appium-python/java/wd/ruby, maestro-web/app, react-native-app,
tosca-dotnet, uipath, xcui-swift. react-native-app uses storybook-rn-ci.yml
(per-repo workflow filename). Skips puppeteer/ember (per decision) and
katalon/espresso (infeasible / needs emulator). Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Their tests assert on the PER-7348 readiness-gate contract and fail against an
ahead-of-release cli@master until they adapt + bump @percy/sdk-utils. Skip per
decision. Part of PER-9772.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pranavz28 pranavz28 marked this pull request as ready for review June 29, 2026 11:48
@pranavz28 pranavz28 requested a review from a team as a code owner June 29, 2026 11:48
pranavz28 and others added 6 commits June 29, 2026 20:11
The fan-out matrix has grown past 30 jobs (currently 35). The
`Get Current Job Log URL` step (Tiryoh/gha-jobid-action) defaults to
per_page=30, so every job on page 2 fails to find itself, resolves
job_id to null, and exits 1 *before* dispatching the SDK workflow —
producing false reds (appium-js, maestro-app, maestro-web,
selenium-ruby) that never actually ran a regression.

Set per_page=100 (jobs API max) to cover the whole matrix, and mark the
step continue-on-error since it only feeds the commit-status target_url
and must never gate the regression itself.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
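
To make the failure mode concrete, here is a small simulation of a single-page job lookup. It does not call the GitHub API or the gha-jobid-action; `lookup_on_first_page` and the job dictionaries are hypothetical stand-ins that only model "search the first page of size per_page".

```python
from typing import Dict, List, Optional


def lookup_on_first_page(
    all_jobs: List[Dict[str, object]], job_name: str, per_page: int
) -> Optional[object]:
    """Search only the first page of results, as a single-request lookup does."""
    first_page = all_jobs[:per_page]
    for job in first_page:
        if job["name"] == job_name:
            return job["id"]
    return None  # the job exists, but beyond the first page


# A 35-job matrix, matching the size quoted in the commit message.
matrix = [{"id": i, "name": f"sdk-{i}"} for i in range(1, 36)]
```

With `per_page=30`, any job in positions 31 to 35 resolves to `None`; with `per_page=100`, the whole matrix fits on one page as long as it stays at or under 100 jobs.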
Both were excluded because their tests assert on the PER-7348
readiness-gate contract and red against an ahead-of-release cli@master.
Re-adding them as-is: both have a workflow_dispatch trigger on their
default branch, so they dispatch and run. Expected to red on master
until they adapt to the two-call readiness contract + bump
@percy/sdk-utils; we'll fix the reds as they surface.

Matrix is now 36 SDKs (job-id lookup already paginated to per_page=100).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
percy-tosca-dotnet and percy-uipath have no test.yml — their @percy/cli
inject step lives in ci.yml (workflow_dispatch + branch input + "Set up
@percy/cli from git" cloning the injected branch are all present there).
The orchestrator was dispatching test.yml, so both 404'd at the trigger
step — previously mis-attributed to a token-access gap. Map both to
ci.yml in the workflow_file_name selector so the fan-out reaches their
real (correct) inject workflow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
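
The workflow_file_name selector can be pictured as a lookup with a default. The sketch below is an illustration in Python rather than the workflow's own expression syntax; the mapping entries are the ones named in this PR, while the constant and function names are assumptions.

```python
# Repos whose inject workflow is not test.yml, as described in this PR.
WORKFLOW_OVERRIDES = {
    "percy-tosca-dotnet": "ci.yml",
    "percy-uipath": "ci.yml",
    "percy-storybook": "test-storybook-v10.yml",
    "percy-react-native-app": "storybook-rn-ci.yml",
}
DEFAULT_WORKFLOW = "test.yml"


def workflow_file_name(repo: str) -> str:
    """Return the workflow file to dispatch for a given SDK repo."""
    return WORKFLOW_OVERRIDES.get(repo, DEFAULT_WORKFLOW)
```

Keeping the exceptions in one table means a new SDK with a non-standard workflow file needs a single added entry, and a missing entry fails loudly as a 404 at dispatch rather than silently running the wrong workflow.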
…m matrix

percy-robotframework is archived (read-only) so workflow dispatch always
404s; percy-nightmare's downstream run hangs until the 6h job limit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ides

Drop the run-sdk-regression label (pull_request) trigger — regression now
runs only via a RUN_REGRESSION PR comment or workflow_dispatch. Dispatch
gains an sdk_refs input (comma-separated repo@branch) to run individual
SDKs' workflows from a specific branch (e.g. CLI master + one SDK's
feature branch); unlisted SDKs keep their matrix default ref. Overridden
refs are validated before flowing into the downstream dispatch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@pranavz28 changed the title from "ci(sdk-regression): auto-run SDK regression on PRs via run-sdk-regression label" to "ci(sdk-regression): full 33-SDK matrix via RUN_REGRESSION comment / dispatch with per-SDK ref overrides (PER-9772)" on Jul 3, 2026