Fix compact backup labels and sharded compact options #6919

Open
RidRisR wants to merge 5 commits into pingcap:release-1.x from RidRisR:backport/compact-backup-labels-release-1.x

Conversation

@RidRisR RidRisR commented May 28, 2026

Contributor

What changed

  • Add compact-backup-specific label keys and label builder helpers.
  • Use CompactBackup instance labels when creating compact Jobs and Pods.
  • Stop labeling compact Jobs with backup-specific labels.
  • Align compact backup label values with the regional CR naming by using compactbackup.
  • Add physicalFileCacheCapacity and name fields to CompactSpec for sharded compact configuration.
  • Pass configured physicalFileCacheCapacity to tikv-ctl as --physical-file-cache-capacity instead of using the hard-coded value.
  • Default omitted or empty physicalFileCacheCapacity to 0 in backup-manager options.
  • Validate provided physicalFileCacheCapacity values as Kubernetes quantities and reject negative values while allowing 0.
  • Pass configured name to tikv-ctl as --name only when it is set.
  • Keep passing --crr-checkpoint-prefix crr-checkpoint for every sharded compact job, including when --until is set explicitly.
  • Stop passing --replication-status-sub-prefix=crr-checkpoint to BR for replication restore; newer BR resolves the checkpoint state itself.
  • Update CRDs, OpenAPI, API reference docs, and regression tests.

Why

CompactBackup Jobs used a hard-coded app.kubernetes.io/instance value and backup-specific labels. Anti-affinity selectors that target the actual CompactBackup or BackupSchedule instance could not match compact Pods, allowing multiple compact Pods to schedule onto the same node.
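The label mismatch described above can be illustrated with a minimal sketch. It is not the operator's actual code: the helper names `compactBackupLabels` and `matchesSelector`, the use of the `app.kubernetes.io/component` key for the `compactbackup` value, and the instance names are all assumptions made for illustration.

```go
package main

import "fmt"

const (
	// instanceLabelKey is the label key named in the PR description.
	instanceLabelKey = "app.kubernetes.io/instance"
	// componentLabelKey is an assumption: the PR says the value
	// "compactbackup" is used, but not which key carries it.
	componentLabelKey = "app.kubernetes.io/component"
)

// compactBackupLabels is a hypothetical builder that derives the instance
// label from the CompactBackup object's name instead of a fixed string.
func compactBackupLabels(instance string) map[string]string {
	return map[string]string{
		instanceLabelKey:  instance,
		componentLabelKey: "compactbackup",
	}
}

// matchesSelector mimics matchLabels semantics: every key/value pair in
// the selector must be present in the Pod labels.
func matchesSelector(selector, labels map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// An anti-affinity term that targets one CompactBackup instance.
	selector := map[string]string{instanceLabelKey: "nightly-compact"}

	// Labels built from the real instance name match the selector.
	fmt.Println(matchesSelector(selector, compactBackupLabels("nightly-compact"))) // true

	// A fixed, unrelated instance value (hypothetical) never matches, so
	// the anti-affinity rule cannot keep such Pods apart.
	fmt.Println(matchesSelector(selector, compactBackupLabels("some-fixed-value"))) // false
}
```

The second call shows why the scheduler could co-locate compact Pods before this change: a selector written against the real instance name finds no Pods to repel.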

Sharded compact also needs user-facing YAML configuration for tikv-ctl --physical-file-cache-capacity and --name. physicalFileCacheCapacity now has a friendlier default: users may omit it and backup-manager will pass 0, while invalid or negative explicit values are still rejected.

For replication restore, newer BR can parse crr-checkpoint/resume-state.json directly, so the operator no longer needs to pass the replication status sub-prefix. Sharded compact still needs the checkpoint prefix for tikv-ctl to locate the replication checkpoint state, even when an explicit EndTs produces --until.
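The sharded compact argument rules described in this PR can be sketched roughly as follows. This is an assumption-laden illustration, not the backup-manager implementation: the `shardedCompactOptions` type, its field names, and `buildShardedCompactArgs` are hypothetical, and the Kubernetes quantity validation (rejecting invalid or negative values) is deliberately not modeled here. Only the flag names come from the PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// shardedCompactOptions is a hypothetical stand-in for the options that
// backup-manager derives from CompactSpec.
type shardedCompactOptions struct {
	PhysicalFileCacheCapacity string // may be empty when omitted in YAML
	Name                      string // optional
	Until                     string // optional, produced from an explicit EndTs
}

// buildShardedCompactArgs returns the tikv-ctl flags discussed in the PR.
// Validation of the capacity as a Kubernetes quantity is out of scope.
func buildShardedCompactArgs(opts shardedCompactOptions) []string {
	capacity := opts.PhysicalFileCacheCapacity
	if capacity == "" {
		// Omitted or empty values default to 0.
		capacity = "0"
	}

	args := []string{
		"--physical-file-cache-capacity", capacity,
		// Always passed for sharded compact, even when --until is set.
		"--crr-checkpoint-prefix", "crr-checkpoint",
	}
	if opts.Name != "" {
		// --name is passed only when configured.
		args = append(args, "--name", opts.Name)
	}
	if opts.Until != "" {
		args = append(args, "--until", opts.Until)
	}
	return args
}

func main() {
	// No capacity, no name: the capacity falls back to 0.
	fmt.Println(strings.Join(buildShardedCompactArgs(shardedCompactOptions{}), " "))

	// Explicit values, including --until; the checkpoint prefix stays.
	fmt.Println(strings.Join(buildShardedCompactArgs(shardedCompactOptions{
		PhysicalFileCacheCapacity: "10Gi",
		Name:                      "shard-a",
		Until:                     "12345",
	}), " "))
}
```

The values `10Gi`, `shard-a`, and `12345` are placeholders chosen for the example, not values taken from the PR.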

Validation

  • GOCACHE=/tmp/tidb-operator-go-build go test ./pkg/controller/compactbackup ./cmd/backup-manager/app/compact/options -count=1
  • go test ./cmd/backup-manager/app/restore ./cmd/backup-manager/app/compact ./pkg/backup/restore
  • rg -n "physicalFileCacheCapacity.*required when mode is sharded|required when mode is sharded" docs/api-references/docs.md pkg/apis/pingcap/v1alpha1 cmd/backup-manager/app/compact pkg/controller/compactbackup -S
  • git diff --check a650aa3ce1e66bb28d76e75fd88d6daef569ec91...HEAD

ti-chi-bot Bot commented May 28, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

ti-chi-bot Bot commented May 28, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dragonly for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the size/M label May 28, 2026
@RidRisR RidRisR marked this pull request as ready for review May 28, 2026 07:52
@ti-chi-bot ti-chi-bot Bot added size/XXL and removed size/M labels May 29, 2026
@RidRisR RidRisR changed the title Fix compact backup job labels Fix compact backup labels and sharded tikv-ctl args May 29, 2026
@RidRisR RidRisR changed the title Fix compact backup labels and sharded tikv-ctl args Fix compact backup labels and sharded compact options Jun 5, 2026

ti-chi-bot Bot commented Jun 10, 2026

Contributor

@RidRisR: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-e2e-kind-scale-simultaneously | 46ebe7e | link | false | /test pull-e2e-kind-scale-simultaneously |
| pull-e2e-kind-basic | 46ebe7e | link | false | /test pull-e2e-kind-basic |
| pull-e2e-kind-dmcluster | 46ebe7e | link | false | /test pull-e2e-kind-dmcluster |
| pull-e2e-kind-tngm | 46ebe7e | link | false | /test pull-e2e-kind-tngm |
| pull-e2e-kind-serial | 46ebe7e | link | false | /test pull-e2e-kind-serial |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
