Entity-level filtering: API and feature flag#4831

Open
gracechen09 wants to merge 2 commits into
apache:mainfrom
gracechen09:entityfilter

Conversation

@gracechen09

This PR introduces the API for entity-level visibility filtering on LIST operations. It is the first part of implementing the proposal, which has been discussed on the dev mailing list (thread1 and thread2).

Changes:

  • added a new request object and a new entity-filtering method, filterByVisibility, with a backward-compatible default implementation
  • added a feature flag that gates the new behavior

Planned follow-up PRs:

  1. implementation of filterByVisibility for the RBAC authorizer
  2. implementation of filterByVisibility for the OPA authorizer
  3. integration of filterByVisibility into the LIST callers
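
For readers who have not opened the diff, here is a minimal sketch of the shape described above: a request object, an authorizer method with a default body, and a flag check at the call site. Only the method name filterByVisibility comes from this PR. The type names, the fields, the pass-through default, and the boolean flag are all assumptions made for illustration and may not match the actual change.

```java
import java.util.List;

// Hypothetical request object: who is listing, and which children were found.
final class VisibilityFilterRequest {
    final String principal;
    final List<String> candidates;

    VisibilityFilterRequest(String principal, List<String> candidates) {
        this.principal = principal;
        this.candidates = List.copyOf(candidates);
    }
}

// Hypothetical authorizer interface; only the method name comes from this PR.
interface SketchAuthorizer {
    // Assumed default: return every candidate, so an authorizer that does not
    // override the method keeps the existing LIST behavior.
    default List<String> filterByVisibility(VisibilityFilterRequest request) {
        return request.candidates;
    }
}

final class ListHandlerSketch {
    // The feature flag is modeled as a plain boolean; how the real flag is
    // read is not shown here.
    static List<String> list(SketchAuthorizer authorizer,
                             boolean filteringEnabled,
                             VisibilityFilterRequest request) {
        return filteringEnabled
                ? authorizer.filterByVisibility(request)
                : request.candidates;
    }
}
```

Under these assumptions, an authorizer that has not yet implemented the method returns the same list whether the flag is on or off, which is what makes the change backward compatible.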

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@github-project-automation github-project-automation Bot moved this to PRs In Progress in Basic Kanban Board Jun 19, 2026
@gracechen09 gracechen09 marked this pull request as ready for review June 19, 2026 04:02
Copilot AI review requested due to automatic review settings June 19, 2026 04:02

Copilot AI left a comment

Contributor

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@flyingImer flyingImer left a comment

Collaborator

Thanks for driving this direction!

I read the linked doc and I'm not sure I fully follow the problem statement. I think it merges two authorization axes that Polaris keeps separate today: discoverability and metadata access.

Discoverability under a parent is the LIST privilege. TABLE_LIST is granted on the namespace, and listTables checks exactly that before returning the children. Metadata access is a separate per-entity grant (*_READ_PROPERTIES). The two don't gate each other: not holding TABLE_LIST on a namespace doesn't stop you from loading a table you were granted directly, and holding TABLE_LIST is what lets you see the child names under that namespace.
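
The separation described here can be sketched as two checks that never consult each other. This is an illustrative model only: the class, the flat grant map, and the use of TABLE_READ_PROPERTIES as the concrete instance of *_READ_PROPERTIES are assumptions, and the sketch ignores role resolution and grant inheritance.

```java
import java.util.Map;
import java.util.Set;

final class TwoAxesSketch {
    // Hypothetical grant store: securable name -> privileges held by one principal.
    private final Map<String, Set<String>> grants;

    TwoAxesSketch(Map<String, Set<String>> grants) {
        this.grants = grants;
    }

    private boolean holds(String securable, String privilege) {
        return grants.getOrDefault(securable, Set.of()).contains(privilege);
    }

    // Discoverability: checked against the parent namespace only.
    boolean canListTables(String namespace) {
        return holds(namespace, "TABLE_LIST");
    }

    // Metadata access: checked against the table itself only.
    boolean canLoadTable(String table) {
        return holds(table, "TABLE_READ_PROPERTIES");
    }
}
```

With a grant on a single table and nothing on its namespace, canLoadTable returns true for that table while canListTables returns false for the namespace, which is the "neither gates the other" behavior.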

On the two problems in the doc:

  • Discoverability: "I can access a table but can't list it without namespace LIST" looks like the model working as intended, not a gap. The grant gives you access to the entity, and enumerating names under the parent is what LIST grants, separately.
  • Visibility of unauthorized entities: I'd keep "unauthorized" (no metadata or data access) apart from "shows up in a list". A name appearing under a parent you hold LIST on is the LIST privilege doing its job, even if you can't read that entity's contents. Suppressing names from a LIST-privileged caller is a different requirement (anti-enumeration), not the current model misbehaving.

Can we re-pin the problem statements?

@gracechen09

Author

Thank you for reviewing the proposal; this is a good point. The original problem statement conflated two distinct authorization concerns, so I updated the proposal to remove discoverability from the problem statement. The motivation now aligns with the proposed solution: a user who holds a *_LIST privilege on a parent should receive only the child entities they have access to. The other case, where a user holds per-entity grants but no parent-level LIST privilege, has moved to the Future Work section and needs further discussion.
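
To make the revised scope concrete, here is a minimal sketch of the intended semantics. It assumes the caller's grants have already been resolved into two simple inputs; the class name, the parameters, and the choice of exception are hypothetical and do not reflect the code in this PR.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class FilteredListSketch {
    // Returns the children a LIST-privileged caller may see.
    static List<String> listVisible(boolean holdsListOnParent,
                                    List<String> children,
                                    Set<String> accessibleChildren) {
        if (!holdsListOnParent) {
            // Unchanged by this proposal: without the parent-level LIST
            // privilege the request is rejected (the Future Work case).
            throw new SecurityException("caller lacks LIST on the parent");
        }
        // New behavior: keep only the children the caller can access,
        // preserving the original order.
        return children.stream()
                .filter(accessibleChildren::contains)
                .collect(Collectors.toList());
    }
}
```

For example, a caller with LIST on the parent and access to t1 and t3 out of t1, t2, t3 would receive t1 and t3 under this sketch.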

3 participants