Equality delete writer should reject invalid equality field IDs #2722

Description

@u70b3

Apache Iceberg Rust version

main

Describe the bug

EqualityDeleteWriterConfig::new currently mishandles invalid equality field ID configurations:

  • an empty equality_ids list is accepted
  • duplicate equality field IDs are accepted
  • field IDs missing from the schema are rejected, but only later by the projector, as a generic error instead of a clear writer configuration error

Equality delete files identify rows by one or more equality column values, so an empty equality ID list should be rejected at writer configuration time. Duplicate IDs can also produce an invalid delete schema with repeated projected columns.

To Reproduce

Construct an equality delete writer config with invalid IDs:

EqualityDeleteWriterConfig::new(vec![], schema.clone());     // empty list: accepted
EqualityDeleteWriterConfig::new(vec![1, 1], schema.clone()); // duplicate ID: accepted
EqualityDeleteWriterConfig::new(vec![99], schema.clone());   // ID not in schema: generic projector error

Before the fix, empty and duplicate IDs are accepted. Missing IDs are rejected later with a less specific projector error.

Expected behavior

EqualityDeleteWriterConfig::new should return an error of kind ErrorKind::DataInvalid for:

  • empty equality ID lists
  • duplicate equality field IDs
  • equality field IDs that do not exist in the schema

The writer should continue to rely on the existing projector logic for unsupported field types and map/list reachability rules.
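
The three checks above could be sketched roughly as follows. This is a standalone illustration, not the crate's code: `validate_equality_ids`, `ConfigError`, and the `schema_field_ids` set are hypothetical stand-ins for the real schema lookup and for an error of kind ErrorKind::DataInvalid.

```rust
use std::collections::HashSet;

/// Hypothetical error type for this sketch only. In the writer, each variant
/// would presumably map to an error of kind `ErrorKind::DataInvalid`.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Empty,
    Duplicate(i32),
    NotInSchema(i32),
}

/// Validates equality field IDs before any projection is attempted.
/// `schema_field_ids` is an assumed, simplified view of the schema: the set
/// of field IDs it contains.
pub fn validate_equality_ids(
    equality_ids: &[i32],
    schema_field_ids: &HashSet<i32>,
) -> Result<(), ConfigError> {
    // An equality delete needs at least one column to match on.
    if equality_ids.is_empty() {
        return Err(ConfigError::Empty);
    }

    let mut seen = HashSet::with_capacity(equality_ids.len());
    for &id in equality_ids {
        // `insert` returns false when the ID was already present.
        if !seen.insert(id) {
            return Err(ConfigError::Duplicate(id));
        }
        // Reject IDs the schema does not know about, with a specific error.
        if !schema_field_ids.contains(&id) {
            return Err(ConfigError::NotInSchema(id));
        }
    }
    Ok(())
}
```

With a schema containing field IDs 1, 2 and 3, the three inputs from the reproduction (`[]`, `[1, 1]`, `[99]`) would each be rejected with a distinct error, while `[1, 2]` passes. Field type and map/list reachability checks are deliberately left out, since those stay with the projector.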

Willingness to contribute

I can contribute a fix for this bug independently.
