fix: Start listening after schema cache load by mkleczek · Pull Request #4880 · PostgREST/postgrest

mkleczek · 2026-05-05T09:25:34Z

This change ensures PostgREST starts listening on a server socket only after it has loaded the schema cache and is ready to handle requests. It no longer returns 503 errors during startup while the schema cache is loading.

DISCLAIMER:
This commit was authored entirely by a human without the assistance of LLMs.

steve-chavez · 2026-05-05T16:37:58Z

The motivation for this change was previously discussed on #4703 (comment).

steve-chavez · 2026-05-05T17:17:48Z

@mkleczek As per #4703 (comment), this would clearly benefit the case of SO_REUSEPORT given 2 PostgREST instances running.

But let's consider the scenario of a single instance managed by systemd behind a proxy:

The service restarts (for any reason, could be manual restart because somehow the schema cache failed reloading).
Right now our startup is fast (milliseconds) and we start responding with 503s. During this time clients will get the 503s with a clear error message that says we're "Retrying.." plus a Retry-After header.

With this change, now we'll not respond and clients will get a connection refused with no error message. And this state can last multiple seconds now that we wait for the scache to load.

So under this scenario, it looks like this new behavior is worse?

Thinking about it more, what we need is zero-downtime restarts, which I guess is easier under this new behavior since we could rely on systemd socket activation?

mkleczek · 2026-05-05T18:22:32Z

> So under this scenario, it looks like this new behavior is worse?

Not really.

From the client's point of view, these 503 errors offer little gain compared to a connect timeout or similar. The client has to handle connection issues anyway, because there are many more cases in which they can happen (for example, the whole server might have crashed). With a reverse proxy in front of PostgREST (i.e. always), the client will get some 50x anyway.
The only reasonable way for the client to handle network issues is to retry. Well-behaved clients will use an exponential backoff with jitter retry policy to avoid overwhelming a freshly started server (i.e. to avoid a thundering herd).
Retry-After is not very useful because it is not reliable. What's worse: if all clients retry according to this header then... boom - thundering herd - I would even say Retry-After is more harmful than good.
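
For illustration only, here is a minimal sketch of the "exponential backoff with jitter" policy mentioned above, using the common "full jitter" variant. The function name `backoff_delays` and its parameters (`base`, `cap`, `attempts`, `rng`) are made up for this example and are not part of PostgREST or any client library.

```python
import random


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=random.random):
    """Yield one delay (in seconds) per retry attempt.

    The ceiling doubles on every attempt up to `cap`; the actual delay is
    a random fraction of that ceiling, so clients that failed at the same
    moment do not all retry at the same moment.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(), start=1):
        print(f"attempt {n}: wait {delay:.2f}s before retrying")
```

Because each delay is drawn at random below a growing ceiling, retries from many clients spread out over time instead of arriving together.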

steve-chavez · 2026-05-06T18:33:32Z

> What's worse: if all clients retry according to this header then... boom - thundering herd - I would even say Retry-After is more harmful than good.

Thought about removing the Retry-After, but its docs say:

> [...] Retry-After indicates the minimum time that the user agent is asked to wait

So it's a minimum, not exact time. I think it should be fine to be clear about this on the docs and recommend jitter.
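
As a rough sketch of what "a minimum plus jitter" could look like on the client side: the helper below treats the Retry-After value as a floor and adds a random amount on top. The name `wait_time` and its parameters are hypothetical, and only the delay-seconds form of the header is handled (the HTTP-date form falls back to a default).

```python
import random


def wait_time(retry_after_header, fallback=1.0, max_jitter=2.0, rng=random.random):
    """Seconds to wait before retrying after a 503.

    Never waits less than the Retry-After value; adds up to `max_jitter`
    extra seconds so clients do not all come back at the same instant.
    """
    try:
        minimum = float(retry_after_header)
    except (TypeError, ValueError):
        # Header missing, or in the HTTP-date form this sketch does not parse.
        minimum = fallback
    return max(minimum, 0.0) + rng() * max_jitter


if __name__ == "__main__":
    print(wait_time("3"))   # somewhere between 3 and 5 seconds
    print(wait_time(None))  # somewhere between 1 and 3 seconds
```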

steve-chavez · 2026-05-06T20:04:18Z

@mkleczek The direction here is good, make sure to address the comments and then we can merge this.

mkleczek · 2026-05-08T06:25:00Z

I am marking this PR as draft to address concerns related to handling schema cache loading errors during start-up.

It seems the right course of action cannot be any of these two extremes:

- always return 503 during schema cache loading on startup
- start listening only after successful schema cache load

The first one forces clients to handle normal conditions as errors.
The second one makes the clients unaware of errors that might happen during schema cache loading which makes diagnostics more difficult.

It seems like the best (ultimate?) startup sequence should be:

1. Start admin server.
2. Try to load schema cache once.
3. Start listening on main socket
4. If there was an error in 2, enter retry loop.

That way we achieve both:

- happy path (i.e. successful startup sequence) does not cause any error responses
- errors are properly reported to clients
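
The four-step startup sequence proposed above can be sketched roughly as follows. This is not PostgREST code (which is written in Haskell): the four callables `start_admin`, `load_schema_cache`, `listen` and `retry_loop` are hypothetical stand-ins, and a real implementation would run the retry loop in the background rather than inline.

```python
def startup(start_admin, load_schema_cache, listen, retry_loop):
    """Run the proposed startup order and report whether the first load worked."""
    start_admin()                # 1. admin server comes up first
    try:
        load_schema_cache()      # 2. exactly one attempt before listening
        loaded = True
    except Exception:
        loaded = False
    listen()                     # 3. main socket opens either way
    if not loaded:
        retry_loop()             # 4. only entered when step 2 failed
    return loaded


if __name__ == "__main__":
    def failing_load():
        raise RuntimeError("database not reachable")

    startup(
        start_admin=lambda: print("admin server started"),
        load_schema_cache=failing_load,
        listen=lambda: print("listening on main socket"),
        retry_loop=lambda: print("entering retry loop"),
    )
```

On the happy path the main socket opens only after a successful load, so no error responses are produced; on failure the socket still opens, so clients get an explicit error instead of a refused connection.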

The above requires wider refactoring - today the whole schema cache loading loop is implemented in a single function without any means to introspect the state of the loading process. Clients can only trigger asynchronous schema load and wait for it to finish.
It makes it related to #4856, which in turn is a prerequisite to implement loading the schema cache using listener connection to fix #4842.

@steve-chavez WDYT?

wolfgangwalther · 2026-05-08T17:18:13Z

> 1. Start admin server.
> 2. Try to load schema cache once.
> 3. Start listening on main socket
> 4. If there was an error in 2, enter retry loop.

I wrote up two different proposals but threw them away, because I always came to the conclusion that this is the sensible thing to do.

So 👍

steve-chavez · 2026-05-08T17:33:50Z

> It seems like the best (ultimate?) startup sequence should be:

Looks much better. Also 👍 from me.

laurenceisla · 2026-05-08T18:34:27Z

> 2. Try to load schema cache once.
> 3. Start listening on main socket

So between these two steps, we'd still return the connection error, however after that we'd retry and get the 503. I agree with this.

@steve-chavez Not sure if it was discussed elsewhere, but this would mean that the proposal to wait until the schema cache is loaded on startup is no longer desired, right?

steve-chavez · 2026-05-08T20:23:00Z

@laurenceisla The waiting is being discussed on #4873 (comment). #4129 won't be solved here.

mkleczek · 2026-05-12T06:08:21Z

> It seems like the best (ultimate?) startup sequence should be:
>
> 1. Start admin server.
> 2. Try to load schema cache once.
> 3. Start listening on main socket
> 4. If there was an error in 2, enter retry loop.
>
> That way we achieve both:
>
> - happy path (i.e. successful startup sequence) does not cause any error responses
> - errors are properly reported to clients

Updated the code to implement the above.

steve-chavez · 2026-05-20T17:07:53Z

> It seems like the best (ultimate?) startup sequence should be:
>
> 1. Start admin server.
> 2. Try to load schema cache once.
> 3. Start listening on main socket
> 4. If there was an error in 2, enter retry loop.
>
> That way we achieve both:
>
> - happy path (i.e. successful startup sequence) does not cause any error responses
> - errors are properly reported to clients

> The first one forces clients to handle normal conditions as errors.

@mkleczek While the happy path is devoid of errors, the "usual path" always has some db connection errors (while the db is coming up; this happens on docker compose). Should we maybe account for a number of retries before giving 503?

Otherwise, I'm wondering if there's value in merging this independently (separate from #4703), since under a connection error we'll now force a client to handle both upstream connection refused and 503 instead of just 503.

If we agree it's not an improvement on its own, maybe we should merge it together in #4703 (which is of course great on its own). That way we avoid a change in behavior here, since on #4703 this change will be guarded by the reuse port config.

Thoughts?

steve-chavez · 2026-05-20T17:13:14Z

> So from the point of view of the clients (they don't know when PostgREST was started), we have 3 alternatives:
>
> 1. connection refused -> 503 -> normal
> 2. connection refused -> blocked/time out -> normal
> 3. connection refused -> normal
>
> I am not sure what value clients get from the first two options compared to the third one.
>
> #4703 (comment)

I remember on #4703 (comment) the third option sounded great and IIRC it was the main motivation for this PR, but under real world conditions we've seen connection refused could last forever (e.g. if the schema cache never loads due to statement_timeout), so in practice we'll always devolve to option 1.

mkleczek · 2026-05-21T11:04:50Z

> > So from the point of view of the clients (they don't know when PostgREST was started), we have 3 alternatives:
> >
> > 1. connection refused -> 503 -> normal
> > 2. connection refused -> blocked/time out -> normal
> > 3. connection refused -> normal
> >
> > I am not sure what value clients get from the first two options compared to the third one.
> >
> > #4703 (comment)
>
> I remember on #4703 (comment) the third option sounded great and IIRC it was the main motivation for this PR, but under real world conditions we've seen connection refused could last forever (e.g. if the schema cache never loads due to statement_timeout), so in practice we'll always devolve to option 1.

The problem is with SO_REUSEPORT - in this case we strictly want the new instance not to start listening until it can serve requests. But we cannot really detect if we are started as a "replacement" instance or a "fresh" instance.
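
To make the SO_REUSEPORT concern concrete, here is a small sketch, assuming a Linux-like platform where `socket.SO_REUSEPORT` is available. Two processes can bind the same port, and on Linux the kernel spreads new connections across every socket that is in the listening state, so a replacement instance that calls `listen()` too early would receive traffic it cannot serve yet. The helper names `bind_reuseport` and `start_accepting` are made up for this example; they are not PostgREST functions.

```python
import socket


def bind_reuseport(port, host="127.0.0.1"):
    """Bind a TCP socket with SO_REUSEPORT set, without calling listen() yet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() on every socket that shares the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


def start_accepting(sock, backlog=128):
    """Call only once the instance can serve requests (e.g. schema cache loaded)."""
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    old = start_accepting(bind_reuseport(0))  # "old" instance, already serving
    port = old.getsockname()[1]
    new = bind_reuseport(port)                # "new" instance binds the same port
    # ... the new instance would load its schema cache here ...
    start_accepting(new)                      # only now does it receive connections
    old.close()
    new.close()
```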

There are several scenarios we have to consider, I guess:

1. Fresh start of PostgREST before the database is available - that can happen when both are started as systemd services that don't have proper dependencies set between them. PostgREST starts first and gets errors when it tries to connect to the db.
2. Fresh start of PostgREST when the database is available and all goes ok (i.e. happy path).
3. SO_REUSEPORT start - e.g. zero-downtime upgrade.

I am now starting to think that the best strategy is the original idea of this PR: do not listen until schema cache is loaded (even in case of errors) - it handles scenarios 2 and 3 properly and in scenario 1 it makes clients receive connection refused until both db and PostgREST are ready. Which I would say is fine - diagnostics is a little more difficult (because clients always get connection refused) but not that much - logs and admin server should provide enough information.

WDYT?

steve-chavez · 2026-05-21T15:42:00Z

> The problem is with SO_REUSEPORT - in this case we strictly want the new instance not to start listening until it can serve requests. But we cannot really detect if we are started as a "replacement" instance or a "fresh" instance.

Yes, that's why I mentioned above that this behavior of connection refused would only make sense with #4703, since there it can be enabled with the server-reuseport config.

> Which I would say is fine - diagnostics is a little more difficult (because clients always get connection refused) but not that much - logs and admin server should provide enough information.

Yes, this would only make things harder for non-reuseport cases (since connection refused can last long). There's no benefit in changing the behavior for the default case. We would cause a breaking change unnecessarily.

mkleczek · 2026-05-21T16:28:21Z

> Yes, this would only make things harder for non-reuseport cases (since connection refused can last long). There's no benefit in changing the behavior for the default case. We would cause a breaking change unnecessarily.

The problem with what we have currently is that clients get errors during startup even if all is fine. That's confusing and IMHO wrong. I'd say that gives us a choice:

1. Change it as it is right now in this PR (i.e. wait for first schema cache load to finish, then listen)
2. Wait for schema cache to load successfully.

The first option stays compatible with what we have now (and does not improve anything in cases when initial schema cache load fails). The second option seems cleaner to me but it is not clear cut.

wolfgangwalther · 2026-05-24T22:05:20Z

Needs a rebase after 1a6ba20.

mkleczek · 2026-05-25T04:31:56Z

> Needs a rebase after 1a6ba20.

The reason I didn't do refactoring first was to avoid hard to resolve conflicts. I'd be grateful if we collaborated more on PRs to make our job easier instead of harder.

mkleczek · 2026-05-25T04:41:18Z

> > Needs a rebase after 1a6ba20.
>
> The reason I didn't do refactoring first was to avoid hard to resolve conflicts. I'd be grateful if we collaborated more on PRs to make our job easier instead of harder.

Rebased

wolfgangwalther · 2026-05-25T07:59:18Z

> The reason I didn't do refactoring first was to avoid hard to resolve conflicts.

Same reasoning here - but with an eye on our future selves, when we need to maintain things. It's much more likely we'd like to revert this fix compared to the refactor. If we do the refactor first, then the fix, it's easy to revert. If we do it the other way around, we'd then need to be very careful at that time.

btw rebasing your changeset over it should not have been hard. It should have been as easy as:

1. Start the rebase on main
2. When hitting conflicts, check out the conflicted file from the final commit in refactor: Simplify App.initServerSocket #4951

The result after the two commits is the same, so that part is really easy. The harder to resolve conflict, which included actually looking at the code, was the one that I did when I cherry-picked it. That's why I did it and didn't force it onto you.

steve-chavez

Looked at all the changes in the tests, they look fine.

All that is left is resolving https://github.com/PostgREST/postgrest/pull/4880/changes#r3306654090.

mkleczek force-pushed the push-tynrmqwlwwus branch from ca17636 to 5bb48c5 (May 5, 2026 09:27)

mkleczek requested a review from steve-chavez (May 5, 2026 09:27)

mkleczek mentioned this pull request (May 5, 2026): add: use SO_REUSEPORT on platform supporting it #4703 (Open)

steve-chavez reviewed (May 5, 2026): comment thread on CHANGELOG.md (outdated)

steve-chavez reviewed (May 5, 2026): comment thread on src/PostgREST/Admin.hs

steve-chavez reviewed (May 5, 2026): comment thread on src/PostgREST/App.hs

steve-chavez requested a review from laurenceisla (May 5, 2026 16:38)

wolfgangwalther reviewed (May 5, 2026): comment thread on test/io/test_io.py

mkleczek force-pushed the push-tynrmqwlwwus branch from 5bb48c5 to 99554cb (May 5, 2026 18:51)

mkleczek force-pushed the push-tynrmqwlwwus branch from 99554cb to 6a927aa (May 7, 2026 08:45)

steve-chavez mentioned this pull request (May 7, 2026): A statement timeout can void the schema cache #4873 (Open)

mkleczek marked this pull request as draft (May 8, 2026 06:05)

mkleczek force-pushed the push-tynrmqwlwwus branch 2 times, most recently from f577ea6 to 3eed89b (May 8, 2026 13:24)

mkleczek force-pushed the push-tynrmqwlwwus branch 3 times, most recently from e653c00 to 55c3e8c (May 12, 2026 05:01)

mkleczek marked this pull request as ready for review (May 12, 2026 06:07)

mkleczek requested a review from steve-chavez (May 12, 2026 06:07)

mkleczek force-pushed the push-tynrmqwlwwus branch 3 times, most recently from 255644b to 9f49c51 (May 20, 2026 10:34)

steve-chavez reviewed (May 21, 2026): comment thread on CHANGELOG.md (outdated)

mkleczek force-pushed the push-tynrmqwlwwus branch from 9f49c51 to d8883f2 (May 22, 2026 13:05)

steve-chavez approved these changes (May 22, 2026)

steve-chavez requested changes (May 22, 2026)

mkleczek force-pushed the push-tynrmqwlwwus branch 2 times, most recently from a6895b2 to c442894 (May 24, 2026 18:08)

mkleczek mentioned this pull request (May 24, 2026): refactor: Simplify App.initServerSocket #4951 (Closed)

mkleczek force-pushed the push-tynrmqwlwwus branch from c442894 to feffcfd (May 25, 2026 04:40)

steve-chavez mentioned this pull request (May 25, 2026): v14: refactor: Simplify App.initServerSocket #4953 (Closed)

steve-chavez reviewed (May 25, 2026): comment thread on test/io/test_io.py

steve-chavez mentioned this pull request (May 25, 2026): Fallback schema cache reload when Listener is disabled #4956 (Open)

steve-chavez requested changes (May 26, 2026)

mkleczek force-pushed the push-tynrmqwlwwus branch 3 times, most recently from 61ed1e8 to 8297bfd (May 31, 2026 21:32)

Commit 28f283d: fix: Start listening after schema cache load

mkleczek force-pushed the push-tynrmqwlwwus branch from 8297bfd to 28f283d (June 2, 2026 07:13)

wolfgangwalther mentioned this pull request (Jun 3, 2026): fix: Do not clear the schema cache during retries #4869 (Open)
