Skip to content

fix: avoid closing running flush object batches(cp to 4.0-dev)#24843

Open
aptend wants to merge 2 commits into
4.0-devfrom
fix/issue-24319-flushobj-release-4.0-dev
Open

fix: avoid closing running flush object batches(cp to 4.0-dev)#24843
aptend wants to merge 2 commits into
4.0-devfrom
fix/issue-24319-flushobj-release-4.0-dev

Conversation

@aptend
Copy link
Copy Markdown
Contributor

@aptend aptend commented Jun 4, 2026

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24319

What this PR does / why we need it:

When a flush object subtask fails during object sync, the parent task can abort and release sibling subtasks. Previously, that release path could close a batch while an IO worker still used it, leaving nil vectors and causing ToCNBatch to panic.

This PR moves batch ownership to the IO task when it starts, lets the parent release only batches that have not been picked up by an IO task, and releases scanned batch data immediately after WriteBatch succeeds before object Sync.

Tests:

  • go test ./pkg/vm/engine/tae/tables/jobs
  • go test ./pkg/vm/engine/tae/containers

When a flush object subtask fails during object sync, the parent task can abort and release sibling subtasks. Previously, that release path could close a batch while an IO worker still used it, leaving nil vectors and causing `ToCNBatch` to panic.

This PR moves batch ownership to the IO task when it starts, lets the parent release only batches that have not been picked up by an IO task, and releases scanned batch data immediately after `WriteBatch` succeeds before object `Sync`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 08:27
@aptend aptend requested a review from XuPeng-SH as a code owner June 4, 2026 08:27
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR backports a fix to prevent containers.ToCNBatch panics caused by a flush-object subtask’s batch being closed while an IO worker is still using it. It does this by transferring batch ownership to the IO task at execution start and ensuring the parent only releases batches that have not been picked up.

Changes:

  • Ensure scheduled flush-object subtasks are registered in the subtasks slice before scheduling, so error paths can reliably release them.
  • Move batch ownership from the parent task to the IO task at the beginning of flushObjTask.Execute, preventing concurrent closes from the parent release path.
  • Release (close) scanned batch data immediately after WriteBatch succeeds (before Sync) to reduce memory retention while syncing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
pkg/vm/engine/tae/tables/jobs/flushTableTail.go Makes subtask tracking/release more reliable and removes the now-obsolete done bookkeeping.
pkg/vm/engine/tae/tables/jobs/flushobj.go Transfers batch ownership safely to the IO task and adjusts release semantics to avoid closing in-use batches.

@aptend aptend changed the title fix: avoid closing running flush object batches fix: avoid closing running flush object batches(cp to 4.0-dev) Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working size/S Denotes a PR that changes [10,99] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants