`InsensitiveSet`: reimplement without `OrderedSet` by risicle · Pull Request #1369 · alphagov/notifications-utils

risicle · 2026-06-10T14:36:08Z

This should allow a significantly faster implementation, using modern python's order-preserving dict, storing normalized items as keys and the original items as their corresponding values.

Behaviour is a compromise between what I deem Sensible, yet following the original InsensitiveSet's behaviour closely enough to pass the unmodified test suite. For example, this implements equality comparison against plain Iterables that requires ordering to match, even though I think it's a bit silly.

Operations and reverse-operations should work, even against a plain Iterable.

The existing implementation of InsensitiveSet is a little bit odd/incomplete and has quite an inefficient implementation of __contains__ (which may be the most heavily used of its set operations). A more robust implementation of InsensitiveSet will allow us to confidently make more use of it within Template & RecipientCSV for performance improvements there.

Note the expected results in the expanded tests are flexible in the accepted intersecting un-normalised values for __or__, __ror__ and __ior__ (i.e. do they come from the left side or right side?). This is because I'd really prefer to be free to choose whatever algorithm is fastest for these operations and not have users depend heavily on the exact behaviour. Instead of this, I'm strictly declaring that un-normalised values will come from the right-hand-side of an operation and the last occurrence in a value stream containing duplicates. This means I had to make an alteration to Field.placeholder to make it prioritise the first occurrence of duplicated placeholders.

A very rudimentary test based on an expanded version of test_markdown_in_templates called in a loop showed a modest ~9% speedup in template processing. Given InsensitiveSet isn't currently very heavily used in template processing, I think this is a decent result.

The new InsensitiveSet is also heavily type-hinted.

risicle · 2026-06-10T16:01:27Z

🤔 I'm perhaps going to reconsider the "not wanting to guarantee the un-normalized results of a union" thing because the admin tests won't pass without me ensuring the left-hand-side wins.

risicle · 2026-06-11T09:56:48Z

Thinking further, it feels like the "right" thing to do is actually to always let the later value "win", because this is how .update() always works for duplicates within an .update() item-pair stream. The only way to make it consistently let the first value win would be to stick a generator into each .update() item-pair stream to do a weird reach-around check..

quis · 2026-06-11T10:33:36Z

The reason we care (and the reason why that admin test fails) is because we want the form label to match the first instance of the placeholder in the template exactly:

risicle · 2026-06-11T11:31:11Z

We could always feed the placeholders into the InsensitiveSet in reverse.

quis · 2026-06-11T11:45:14Z

I looked into it further and the problem isn’t just with unions:

InsensitiveSet(('NAME', 'name'))
>>> InsensitiveSet(['name'])

I guess Field.placeholders could do a double-reverse but I don’t love it…

risicle · 2026-06-11T13:19:55Z

Yep, the workaround I've put in is a double-reverse, which I don't love either, but remember previously we were constructing a new InsensitiveDict for each __contains__ call. This is only one additional dict construction.

risicle · 2026-06-12T10:34:18Z

We need to agree on exact desired behaviour.

risicle · 2026-06-15T09:07:30Z

Chris sez he would ideally like it to work with first and left-most un-normalised values take precedence over later and right-most ones. This makes sense in some ways but it's awkward to make a fast implementation.

this should allow a significantly faster implementation, using modern python's order-preserving dict, storing normalized items as keys and the original items as their corresponding values. behaviour is a compromise between what i deem Sensible, yet following the original InsensitiveSet's behaviour closely enough to pass the unmodified test suite. for example, this implements equality comparison against plain Iterables that requires ordering to match, even though i think it's a bit silly. operations and reverse-operations (other on LHS, other on RHS) should work, even againsta plain Iterable. non-normalised values should always be taken from the LHS in operations where duplicates are included. this is contrary to the behaviour of dict.update(...), which always prefers later values when there are duplicate keys in an item-pair iterable. so instead of using dict.update(...) we have to create our own _add_inner_pairs(...) method which will entirely skip already-present values.

risicle · 2026-06-15T15:30:29Z

Have pushed a version that is left-prioritising.

risicle marked this pull request as ready for review June 10, 2026 14:53

risicle requested a review from quis June 10, 2026 14:53

quis reviewed Jun 10, 2026

View reviewed changes

Comment thread tests/test_insensitive_dict.py

Comment thread notifications_utils/insensitive_dict.py

Comment thread notifications_utils/insensitive_dict.py

Comment thread notifications_utils/insensitive_dict.py

quis reviewed Jun 11, 2026

View reviewed changes

Comment thread notifications_utils/insensitive_dict.py Outdated

risicle force-pushed the ris-new-insensitive-set branch 3 times, most recently from 0667710 to 31f1e1b Compare June 11, 2026 13:56

risicle requested a review from quis June 11, 2026 14:00

risicle force-pushed the ris-new-insensitive-set branch from 31f1e1b to 7398c09 Compare June 11, 2026 14:13

risicle marked this pull request as draft June 12, 2026 10:33

risicle added 4 commits June 15, 2026 16:22

InsensitiveDict: use InsensitiveSet as keys return type

900a526

InsensitiveSet: expand test coverage

1969616

Bump version to 119.0.0

577d4b0

risicle force-pushed the ris-new-insensitive-set branch from 7398c09 to 577d4b0 Compare June 15, 2026 15:28

risicle marked this pull request as ready for review June 15, 2026 15:30

quis approved these changes Jun 15, 2026

View reviewed changes

risicle merged commit 9a6342b into main Jun 16, 2026
6 checks passed

risicle deleted the ris-new-insensitive-set branch June 16, 2026 14:37

This was referenced Jun 16, 2026

notifications-utils: 118.0.0 -> 119.0.0 alphagov/notifications-admin#5966

Open

notifications-utils: 118.0.0 -> 119.0.0 alphagov/notifications-api#4885

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`InsensitiveSet`: reimplement without `OrderedSet`#1369

`InsensitiveSet`: reimplement without `OrderedSet`#1369
risicle merged 4 commits into
mainfrom
ris-new-insensitive-set

risicle commented Jun 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

risicle commented Jun 10, 2026

Uh oh!

risicle commented Jun 11, 2026 •

edited

Loading

Uh oh!

quis commented Jun 11, 2026

Uh oh!

risicle commented Jun 11, 2026

Uh oh!

quis commented Jun 11, 2026

Uh oh!

Uh oh!

risicle commented Jun 11, 2026

Uh oh!

risicle commented Jun 12, 2026

Uh oh!

risicle commented Jun 15, 2026

Uh oh!

risicle commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

risicle commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

risicle commented Jun 10, 2026

Uh oh!

risicle commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

quis commented Jun 11, 2026

Uh oh!

risicle commented Jun 11, 2026

Uh oh!

quis commented Jun 11, 2026

Uh oh!

Uh oh!

risicle commented Jun 11, 2026

Uh oh!

risicle commented Jun 12, 2026

Uh oh!

risicle commented Jun 15, 2026

Uh oh!

risicle commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

risicle commented Jun 10, 2026 •

edited

Loading

risicle commented Jun 11, 2026 •

edited

Loading