chore(glue-sync): Ignore EntityNotFoundException when dropping Glue partitions#19142

Merged
danny0405 merged 3 commits into apache:master from wangxianghu:glue-drop-partitions-ignore-entity-not-found on Jul 3, 2026

Conversation

@wangxianghu wangxianghu commented Jul 2, 2026

Contributor

Describe the issue this Pull Request addresses

AWSGlueCatalogSyncClient#dropPartitionsInternal previously failed the whole sync whenever BatchDeletePartition returned any error, including EntityNotFoundException for partitions that no longer exist. Dropping a non-existent partition is a no-op for this idempotent cleanup and is easily triggered by sync retries, concurrent writers, or externally-removed partitions.

Summary and Changelog

AWSGlueCatalogSyncClient#dropPartitionsInternal previously failed the entire sync whenever BatchDeletePartition returned any error in the response, including EntityNotFoundException for partitions that no longer exist.

Dropping a non-existent partition is a no-op for this idempotent cleanup and is easily triggered in practice by sync retries, concurrent writers dropping the same partitions, or partitions removed externally.

Changes:

  • Filter out EntityNotFoundException errors returned by BatchDeletePartition; other errors (e.g. permission / throttling) still raise HoodieGlueSyncException.
  • Log the ignored (non-existent) partition values at INFO level so the skip is observable.
  • Added unit tests testDropPartitions_IgnoresEntityNotFound and testDropPartitions_MixedErrorsStillThrow.
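
For readers skimming the thread, the filtering described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not the merged code: `PartitionError` is a hypothetical stand-in for the Glue SDK's per-partition error entry, `filterIgnorable` is an invented helper name, and `IllegalStateException` stands in for `HoodieGlueSyncException` so the snippet compiles without Hudi or the AWS SDK on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;
import java.util.stream.Collectors;

class DropPartitionErrorFilter {
  private static final Logger LOG =
      Logger.getLogger(DropPartitionErrorFilter.class.getName());

  // Error code reported per partition when the partition no longer exists.
  static final String ENTITY_NOT_FOUND = "EntityNotFoundException";

  // Hypothetical stand-in for the SDK's per-partition error entry.
  static final class PartitionError {
    final List<String> partitionValues;
    final String errorCode;

    PartitionError(List<String> partitionValues, String errorCode) {
      this.partitionValues = partitionValues;
      this.errorCode = errorCode;
    }
  }

  // Returns the partition values that were skipped as already gone.
  // Throws if any error other than EntityNotFoundException is present.
  static List<List<String>> filterIgnorable(List<PartitionError> errors) {
    // partitioningBy always yields both the true and the false bucket.
    Map<Boolean, List<PartitionError>> split = errors.stream()
        .collect(Collectors.partitioningBy(e -> ENTITY_NOT_FOUND.equals(e.errorCode)));

    List<PartitionError> realErrors = split.get(false);
    if (!realErrors.isEmpty()) {
      String codes = realErrors.stream()
          .map(e -> String.valueOf(e.errorCode))
          .collect(Collectors.joining(", "));
      // Stand-in for HoodieGlueSyncException in the real client.
      throw new IllegalStateException("Failed to drop partitions, error codes: " + codes);
    }

    List<List<String>> skipped = split.get(true).stream()
        .map(e -> e.partitionValues)
        .collect(Collectors.toList());
    if (!skipped.isEmpty()) {
      // Logged at INFO so the skip stays observable.
      LOG.info("Ignoring non-existent partitions: " + skipped);
    }
    return skipped;
  }

  public static void main(String[] args) {
    List<PartitionError> errors = Arrays.asList(
        new PartitionError(Arrays.asList("2026", "07", "01"), ENTITY_NOT_FOUND));
    System.out.println(filterIgnorable(errors));
  }
}
```

A batch containing only EntityNotFoundException entries returns normally, while a batch that mixes in any other error code still throws, which is the behavior the two new unit tests are named after.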

Impact

none

Risk Level

none

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@hudi-agent hudi-agent left a comment

Contributor

⚠️ 🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR makes AWSGlueCatalogSyncClient#dropPartitionsInternal tolerant of EntityNotFoundException errors returned by Glue's BatchDeletePartition, treating already-gone partitions as a no-op while still failing on other errors. I verified the error-code string against the AWS Glue API docs (per-partition ErrorDetail.ErrorCode is "EntityNotFoundException"), traced the partitioning/throw logic and the idempotency semantics, and confirmed batching and imports are unaffected. No correctness issues found. A couple of small naming and readability suggestions are in the inline comments below. Please take a look; after that, this should be ready for a Hudi committer or PMC member to take it from here.

cc @yihua

Comment thread hudi-aws/src/main/java/org/apache/hudi/aws/sync/AWSGlueCatalogSyncClient.java Outdated
Comment thread hudi-aws/src/main/java/org/apache/hudi/aws/sync/AWSGlueCatalogSyncClient.java Outdated
@wangxianghu wangxianghu force-pushed the glue-drop-partitions-ignore-entity-not-found branch from 6c4c12a to 52b0e23 Compare July 2, 2026 09:11
@wangxianghu wangxianghu changed the title from "feat(glue-sync): Ignore EntityNotFoundException when dropping Glue partitions" to "chore(glue-sync): Ignore EntityNotFoundException when dropping Glue partitions" on Jul 2, 2026
@github-actions github-actions Bot added the size:S PR with lines of changes in (10, 100] label Jul 2, 2026

@hudi-agent hudi-agent left a comment

Contributor

⚠️ 🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for working on this! This PR makes Glue partition drops idempotent by filtering out per-partition EntityNotFoundException errors from BatchDeletePartition (logging them at INFO) while still raising HoodieGlueSyncException for other errors. The change mirrors the existing AlreadyExistsException handling in addPartitionsToTableInternal, the partitioningBy split is null-safe, and the round-1 nits (constant extraction and ignorableErrors naming) look addressed. No correctness issues found. One small naming suggestion on the error-code constant is in the inline comment; the rest of the change is clean and well-commented. Please take a look; after that, this should be ready for a Hudi committer or PMC member to take it from here.

cc @yihua

Comment thread hudi-aws/src/main/java/org/apache/hudi/aws/sync/AWSGlueCatalogSyncClient.java Outdated

@hudi-agent hudi-agent left a comment

Contributor

⚠️ 🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR makes Glue partition drops tolerant of EntityNotFoundException so idempotent cleanup no longer fails the entire sync when a partition is already gone, while still surfacing other errors such as permission and throttling failures. The partitioning logic is null-safe on errorCode(), treats unexpected error shapes as real errors (fail-closed), and the semantic interpretation of a per-partition EntityNotFoundException is correct. No issues flagged from this automated pass - a Hudi committer or PMC member can take it from here for a final review.
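
The null-safe, fail-closed property mentioned above can be illustrated with a small sketch. It assumes the check is a constant-first `equals` comparison, which is one common way to get this behavior; the class and method names (`IgnorableErrorCheck`, `isIgnorable`) are hypothetical and not taken from the PR.

```java
class IgnorableErrorCheck {
  static final String ENTITY_NOT_FOUND = "EntityNotFoundException";

  // Constant-first equals never throws on a null errorCode, and anything
  // that is not an exact match is treated as a real error (fail-closed).
  static boolean isIgnorable(String errorCode) {
    return ENTITY_NOT_FOUND.equals(errorCode);
  }

  public static void main(String[] args) {
    System.out.println(isIgnorable("EntityNotFoundException")); // true
    System.out.println(isIgnorable(null));                      // false
    System.out.println(isIgnorable("AccessDeniedException"));   // false
  }
}
```

With this shape, a missing or unexpected error code falls into the "real error" bucket, so the sync still fails rather than silently skipping a partition it could not classify.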

cc @yihua

@hudi-bot

hudi-bot commented Jul 2, 2026

Collaborator

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@danny0405 danny0405 merged commit 667626c into apache:master Jul 3, 2026
73 checks passed
Labels

size:S PR with lines of changes in (10, 100]

4 participants