feat(datalink): GC/reclamation for pinned datalink CAS#24835
Draft
ck89119 wants to merge 19 commits into
Draft
Conversation
datalink snapshots only versioned the reference string, not the bytes of the referenced file, so historical state was not byte-reproducible after an out-of-band overwrite (issue matrixorigin#24555). Add an opt-in datalink_pin(datalink) builtin that freezes the bytes currently referenced by a datalink into an immutable content-addressed store (CAS) backed by the SHARED file service, and returns a datalink carrying ?contenthash=<sha256>. The read path serves a pinned datalink directly from the CAS (a missing object errors out instead of falling back to the live, possibly overwritten file), so historical snapshots stay byte-reproducible. Reading the SHARED service uses a plain FileService rather than GetForETL, since SHARED may be a LocalFS that does not implement ETLFileService in standalone. Default (un-pinned) datalink behavior is unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Size guard: refuse to pin objects larger than MaxBlobLen (mirroring load_file), accounting for offset/size, to avoid loading a huge object fully into memory and OOMing the CN. Streaming hash/copy is a follow-up. - Strip offset/size case-insensitively when emitting the pinned URL, so a mixed-case ?Offset=/?Size= no longer survives and wrongly re-slices the CAS object on read. - Derive Datalink.ContentHash from the parsed MoPath instead of re-parsing the query, keeping it consistent with ParseDatalink for duplicate/mixed-case contenthash params. - BVT: add the core overwrite regression — pin a value, repoint the stage to new content, and assert the pinned datalink still reads the original bytes while the live datalink reads the new bytes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…24555 # Conflicts: # pkg/sql/plan/function/function_id.go # pkg/sql/plan/function/function_id_test.go
In compose multi-CN environments the SHARED file service includes an internal directory prefix (e.g. server/data/) in its error messages, making the path differ from the CAS key used in standalone mode. Intercept ErrFileNotFound in the pinned read/stat paths and re-wrap with NewFileNotFound(d.MoPath) so the message always uses the CAS key (datalink_cas/<h2>/<hash>) regardless of how the file service stores objects internally. The BVT result file already expects this form. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review feedback (XuPeng-SH): the content-addressed datalink store was globally addressable by raw SHA-256, making a contenthash a bearer capability — any caller knowing a hash could read pinned bytes without the original path/stage authorization, and identical bytes were deduplicated across tenants (cross-account visibility by hash). Scope every CAS object by the calling account: - CAS layout becomes datalink_cas/<accountID>/<h2>/<hash>. - The accountID is resolved from the trusted execution context (defines.GetAccountId), never from the datalink URL, so it cannot be forged by editing query params. - Reads reconstruct the key under the caller's account, so a contenthash pinned by one account is not readable by another and there is no cross-account dedup visibility. Also key the pinned-not-found error on the content hash rather than the internal CAS storage path: the path embeds the file-service layout (server/data/ prefix in compose) and the account id, both non-deterministic; the hash is what the user supplied. Tests: per-account CAS isolation, account-scoped parse, cross-account read isolation (UT), and a cross-account BVT case where a second account cannot read the sys account's pinned content. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…d writes Review feedback (XuPeng-SH): 1. datalink_pin() treated an already-pinned URL as idempotent after only validating the contenthash format, without checking the CAS object exists in the caller's account namespace. A contenthash minted by another account was returned unchanged and failed only later at read time. pinDatalink now calls CASExists for the caller's account: it returns the URL unchanged only when the blob is present, otherwise it strips the contenthash, re-reads the live source and re-pins for the current account. The re-pinned live bytes must still hash to the declared contenthash (verified before CASPut), so an unavailable version errors out rather than silently pinning different bytes. 2. Pinned datalinks read from account-scoped CAS keys, but save_file() still built a generic writer from MoPath, which for a pinned value is the internal CAS key rather than the original external URL. Datalink.NewWriter now rejects writes for contenthash datalinks: pinned CAS blobs are immutable. Tests: NewWriter/save_file pinned-write rejection, per-account idempotency, re-pin from live when the blob is absent, and re-pin rejection on changed live content (UT); plus a BVT case asserting save_file on a pinned datalink errors. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Introduce package-level function variables for sqlexec.RunTxnWithSqlContext and sqlexec.RunSql in production.go so tests can stub them with gostub. Add production_test.go covering listAccountIDs, datalinkColumns, scanColumn, liveSnapshots, refsForAccount, and DatalinkCASGCTaskMetadata, raising package coverage from 43.3% to 84.0%. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
fengttt
approved these changes
Jun 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What type of PR is this?
Which issue(s) this PR fixes:
issue #24743
What this PR does / why we need it:
Adds garbage collection / reclamation for the per-account, content-addressed store (CAS)
datalink_cas/<accountID>/<hash>that backs pinned datalinks (datalink_pin, #24641). Without it, pinned blobs are retained forever once their referencing datalink values are gone (rows deleted, table dropped, snapshot expired, account dropped), leaking storage in the SHARED file service.Two mechanisms:
DROP ACCOUNT cleanup — on
DROP ACCOUNT, the account's entiredatalink_cas/<accountID>/prefix is reclaimed (best-effort; a failure never aborts the drop, the sweep is the backstop).pkg/frontend/authenticate.go,pkg/datalink/cas.go.Reference-aware periodic sweep (
pkg/datalink/casgc/) — an hourly cron task that, per account, collects the content hashes referenced by live datalink columns and live snapshots, then reclaims unreferenced CAS blobs. Because the file service exposes no object mtime, the grace period is a two-pass mark-and-delete with in-process state: a blob is deleted only after staying an orphan across passes for at least GraceWindow (default 24h, configurable; sweep interval 1h). The sweep is candidate-driven (it skips snapshot scans when nothing is collectable), and a singleton Sweeper preserves the grace-window state across cron firings.Snapshot safety: CAS blobs live in the SHARED file service, not in MVCC data, so snapshots/PITR do not pin them. The sweep therefore treats hashes referenced by any live snapshot as live, preventing dangling datalink references after a restore.
Tests
pkg/datalink76.6%,pkg/datalink/casgc84.0%.func_datalink_cas_gc— pin content into a table, DROP ACCOUNT with pinned blobs present, recreate a same-named account with a clean namespace (11/11 pass).makeandmake static-checkclean.🤖 Generated with Claude Code