refactor: add internal-schema-cache-lock-id#4962
Conversation
81d9e4d to
0c9d5a7
Compare
| "Slow schema cache load environment PostgREST." | ||
| return { | ||
| **defaultenv, | ||
| "PGRST_INTERNAL_SCHEMA_CACHE_LOCK_ID": "1111", |
There was a problem hiding this comment.
Q: Do we really need to specify an id externally? Seems like it could just be a boolean and we generate the id internally?
There was a problem hiding this comment.
-- Unblock running first query in querySchemaCache
SELECT pg_advisory_unlock(, 0)
Ah, I guess we couldn't do the above if we generate the id internally. The idea is to run the unlocks during testing right?
There was a problem hiding this comment.
@mkleczek I think we can make this clearer and just have a boolean config by adding a wrapper SQL function.
So our schema cache queries have a sort of name id:
28/May/2026:13:05:53 -0500: tables: 50.432 ms, keydeps: 29.778 ms, rels: 8.315 ms, funcs: 48.383 ms, comprels: 5.557 ms, dreps: 2.049 ms, mhandlers: 5.745 ms, tzones: 39.676 ms
So we could have:
SELECT pgrst_scache_advisory_unlock('tables')
SELECT pgrst_scache_advisory_unlock('keydeps')
SELECT pgrst_scache_advisory_unlock('rels')
-- ...The wrapper func would have to internally translate those to hardcoded lock ids, which I think it's safe to do for our tests.
WDYT?
Also, do we really need to unlock them individually or would a pg_advisory_unlock_all be enough? If so maybe we don't need to know the ids on the tests.
|
This looks like a great idea 💯. Left some comments. |
| , dbQueryTimings = qsTime | ||
| } | ||
| (`evalStateT` 0) $ do | ||
| tabs <- sqlWithLock $ sqlTimedStmt gucTbls conf allTables |
There was a problem hiding this comment.
Maybe it would be good to add a comment here on why have added sqlWithLock here? It would help whoever touches the code here later.
There was a problem hiding this comment.
We should also be able to remove the evalStateT if we use hardcoded ids plus a wrapper as I mentioned above.
32a2995 to
0e7cf88
Compare
ec5270c to
5bc5cdb
Compare
wolfgangwalther
left a comment
There was a problem hiding this comment.
I admit, I have not followed the whole discussion about that internal schema sleep thing and the related tests, but do we really need to be able to step through the schema cache queries? Or would it be enough to somehow block any schema cache from loading for a while?
I just tested a LOCK TABLE pg_class IN ACCESS EXCLUSIVE MODE;, which works as expected - you can't select from pg_class anymore while a different transaction holds this lock. This should at least allow to queue up some schema cache reloads and see whether all of them execute.
| -- if configInternalSCLockId is set run queries step-by-step waiting for lock release before each | ||
| SQL.runSteppedTransaction @SchemaCacheLabel configInternalSCLockId $ |
There was a problem hiding this comment.
So we are changing the way PostgREST operates depending on whether we are in a test-case or not?
Hard no from me. We really want to test the actual code we're running in production and not something that is just running that way in tests.
There was a problem hiding this comment.
So we are changing the way PostgREST operates depending on whether we are in a test-case or not?
We already have it: internal-schema-cache-query-sleep.
This PR goal is to replace it with much more capable internal-schema-cache-lock-id
Hard no from me. We really want to test the actual code we're running in production and not something that is just running that way in tests.
That's a noble goal but some tests - especially concurrency related - cannot be really written without certain features available (eg. suspending computation at specific moments in time etc.). If you are able to provide such facilities without cooperation from the program, I am all for using them. So far nobody came up with anything like that.
See this discussion: https://github.com/PostgREST/postgrest/pull/4880/changes#r3228963615 This is the reason why we even discuss this PR :)
This is too coarse grained, I think. The issue is that we want to be able to trigger a schema change in a specific moments during already running schema reloading process. This triggered some thinking and going into the rabbit hole... I would say it would be useful if we had a debugging facility allowing us to perform step-by-step execution of sql statements, as long as it does not require very intrusive changes in the code. |
After looking at the thread, it looks like we only need a way to block towards the end of the schema cache reload, right? I assume that PostgREST loads the schema cache in a single transaction in order, so timezones - if enabled - always come last? In that case an exclusive access lock on |
I did consider doing it that way but I don't really like it because:
I'd say that using access locks on catalog tables is in reality a hack providing a "poor man" step-by-step query execution. This PR is an attempt to provide this kind of infrastructure. This infrastructure (in the new |
That's true, but I consider a test that is bound to internals better than a test that requires changing internals.
I understood that this deficiency would be covered by (additionally) checking the explicit number of schema cache reloads that happened, right? So the thing we're really trying to do is to get rid of that internal sleep thing. Which the locking would do.
Sure, I agree on that. Tests are full of hacks, though - and it just doesn't help to avoid hacks, but to then not run the actual code that runs in production. Now if we had tons of tests that make use of this infrastructure, it could make sense to think about how to implement it in a mostly transparent way - but we're talking about a single test here and I have not heard about any other ideas or requirements for other tests in this area. I might have missed that part, though. |
0bbf723 to
57ddd27
Compare
| @@ -0,0 +1,201 @@ | |||
| {-# LANGUAGE AllowAmbiguousTypes #-} | |||
There was a problem hiding this comment.
Just looked at the code in this file and it's quite a lot. Is this really needed for this new test machinery?
There was a problem hiding this comment.
It is needed to have type level labeling of statements, like:
tabs <- SQL.statement @Tables conf allTables
keyDeps <- SQL.statement @KeyDependencies conf allViewsKeyDependencies
m2oRels <- SQL.statement @Relationships mempty allM2OandO2ORels
Label in the signature of SQL.statement is not present in any of the parameter types:
statement :: forall {k} (label :: k) m a b. (SqlTransaction k m, LabelConstraint k label) => a -> SQL.Statement a b -> m b - so the type of the label cannot be inferred from the arguments alone - hence AllowAmbiguousTypes.
I want to the statements to be labeled like this so that the labels are stable even if you change the underlying mechanism to execute them.
For example: you can add individual statement timing and step-by-step execution to any transaction transparently - to validate the concept I've implemented two PoCs:
- step-by-step execution of queries during request handling
- individual query timings added to
Server-Timingheader
Bear with me - I am still working on it and trying to come up with the best (ie. most ergonomic and stable) API. So it might change.
57ddd27 to
38137e9
Compare
38137e9 to
2766b62
Compare
This change adds possibility of step-by-step execution of SQL statements in
querySchemaCachefunction.If a new
internal-schema-cache-lock-idconfig is set (to an integer value),querySchemaCachewill executeSELECT pg_advisory_xact_lock(<internal-schema-cache-lock-id>, <next_lock_number>)before each query (next_lock_numberstarts from0and is incremented after each statement).That way it is possible to control the execution of queries in
querySchemaCache, eg:In a database session:
Trigger schema cache loading...
In the database session:
Run some actions before
querySchemaCacheexecutes second query.... etc.