author | Andres Freund <andres@anarazel.de> | 2018-06-12 11:13:21 -0700 |
---|---|---|
committer | Andres Freund <andres@anarazel.de> | 2018-06-12 11:13:21 -0700 |
commit | 2ce64caaf793615d0f7a7ce4380e7255077d130c (patch) | |
tree | 0d283c9b5c4b25c2ef5fef6536cbba7a8724436b /src/backend/utils/cache/inval.c | |
parent | b10edaf4bb4473adc897f5643ede8c9100abddce (diff) | |
Fix bugs in vacuum of shared rels, by keeping their relcache entries current.
When vacuum processes a relation it uses the corresponding relcache
entry's relfrozenxid / relminmxid as a cutoff for when to remove
tuples etc. Unfortunately, for nailed relations (i.e. critical system
catalogs), bugs could frequently leave the corresponding relcache
entry stale.
This set of bugs could cause actual data corruption as vacuum would
potentially not remove the correct row versions, potentially reviving
them at a later point. After 699bf7d05c some corruptions in this vein
were prevented, but the additional error checks could also trigger
spuriously. Examples of such errors are:
ERROR: found xmin ... from before relfrozenxid ...
and
ERROR: found multixact ... from before relminmxid ...
These errors can only be attributed to this bug when they occur on
system catalog tables.
The two bugs are:
1) Invalidations for nailed relations were ignored, based on the
theory that the relcache entry for such tables doesn't
change. That is largely true, except for fields like relfrozenxid.
As a result, changes made when a relation was vacuumed in another
session weren't picked up by already existing sessions. Luckily
autovacuum doesn't have particularly long-running sessions.
2) For shared *and* nailed relations, the shared relcache init file
was never invalidated while running. That means that for such
tables (e.g. pg_authid, pg_database) not only already existing
sessions are affected, but new connections as well.
That explains why the reports usually were about pg_authid et al.
To fix 1), revalidate the rd_rel portion of a relcache entry when
invalid. This implies a bit of extra complexity to deal with
bootstrapping, but it's not too bad. The fix for 2) is simpler:
always remove both the shared and local init files.
Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion:
https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de
https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com
https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com
Backpatch: 9.3-
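The first bug described in the commit message above, and the idea behind its fix, can be pictured with a toy model. This is only a sketch under stated assumptions, not PostgreSQL's relcache code: `ToyRelcacheEntry`, `toy_catalog_relfrozenxid`, and every `toy_*` function are hypothetical names invented for illustration, and the model collapses "revalidate the rd_rel portion" into re-reading a single field. It also does not model dropping non-nailed entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a relcache entry. */
typedef struct ToyRelcacheEntry
{
	bool		nailed;			/* entry is never dropped from the cache */
	bool		valid;			/* false once an invalidation was accepted */
	uint32_t	relfrozenxid;	/* cached copy of the catalog value */
} ToyRelcacheEntry;

/* Hypothetical stand-in for the catalog row that another session updates. */
static uint32_t toy_catalog_relfrozenxid = 100;

/* Behaviour described as bug 1: invalidations for nailed entries are ignored. */
static void
toy_invalidate_old(ToyRelcacheEntry *entry)
{
	assert(entry != NULL);
	if (entry->nailed)
		return;
	entry->valid = false;
}

/* Idea of the fix: the nailed entry stays in place but is marked invalid. */
static void
toy_invalidate_fixed(ToyRelcacheEntry *entry)
{
	assert(entry != NULL);
	entry->valid = false;
}

/* On access, refresh the catalog-derived field if the entry is invalid. */
static uint32_t
toy_get_relfrozenxid(ToyRelcacheEntry *entry)
{
	assert(entry != NULL);
	if (!entry->valid)
	{
		entry->relfrozenxid = toy_catalog_relfrozenxid;
		entry->valid = true;
	}
	return entry->relfrozenxid;
}

/*
 * Another session vacuums the relation and advances the catalog value to
 * 500; return the value this session sees afterwards.
 */
static uint32_t
toy_scenario(bool fixed)
{
	ToyRelcacheEntry entry = {.nailed = true, .valid = true, .relfrozenxid = 100};

	toy_catalog_relfrozenxid = 500;
	if (fixed)
		toy_invalidate_fixed(&entry);
	else
		toy_invalidate_old(&entry);
	return toy_get_relfrozenxid(&entry);
}
```

In this model `toy_scenario(false)` keeps returning the stale value 100, while `toy_scenario(true)` returns the advanced value 500, which is the difference the commit message attributes to revalidating nailed entries.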
Diffstat (limited to 'src/backend/utils/cache/inval.c')
-rw-r--r-- | src/backend/utils/cache/inval.c | 32 |
1 files changed, 20 insertions, 12 deletions
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index d0e54b85352..2226b325720 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -521,12 +521,12 @@ RegisterRelcacheInvalidation(Oid dbId, Oid relId)
 	(void) GetCurrentCommandId(true);
 
 	/*
-	 * If the relation being invalidated is one of those cached in the local
-	 * relcache init file, mark that we need to zap that file at commit. Same
-	 * is true when we are invalidating whole relcache.
+	 * If the relation being invalidated is one of those cached in a relcache
+	 * init file, mark that we need to zap that file at commit. For simplicity
+	 * invalidations for a specific database always invalidate the shared file
+	 * as well. Also zap when we are invalidating whole relcache.
 	 */
-	if (OidIsValid(dbId) &&
-		(RelationIdIsInInitFile(relId) || relId == InvalidOid))
+	if (relId == InvalidOid || RelationIdIsInInitFile(relId))
 		transInvalInfo->RelcacheInitFileInval = true;
 }
 
@@ -881,18 +881,26 @@ ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
 
 	if (RelcacheInitFileInval)
 	{
+		elog(trace_recovery(DEBUG4), "removing relcache init files for database %u",
+			 dbid);
+
 		/*
-		 * RelationCacheInitFilePreInvalidate requires DatabasePath to be set,
-		 * but we should not use SetDatabasePath during recovery, since it is
+		 * RelationCacheInitFilePreInvalidate, when the invalidation message
+		 * is for a specific database, requires DatabasePath to be set, but we
+		 * should not use SetDatabasePath during recovery, since it is
 		 * intended to be used only once by normal backends. Hence, a quick
 		 * hack: set DatabasePath directly then unset after use.
 		 */
-		DatabasePath = GetDatabasePath(dbid, tsid);
-		elog(trace_recovery(DEBUG4), "removing relcache init file in \"%s\"",
-			 DatabasePath);
+		if (OidIsValid(dbid))
+			DatabasePath = GetDatabasePath(dbid, tsid);
+
 		RelationCacheInitFilePreInvalidate();
-		pfree(DatabasePath);
-		DatabasePath = NULL;
+
+		if (OidIsValid(dbid))
+		{
+			pfree(DatabasePath);
+			DatabasePath = NULL;
+		}
 	}
 
 	SendSharedInvalidMessages(msgs, nmsgs);
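To see what the first hunk of the diff above changes, the old and new conditions from `RegisterRelcacheInvalidation` can be compared side by side in a standalone sketch. The two predicates are copied from the diff; everything else is an assumption made for illustration: `toy_rel_in_init_file` is a hypothetical stand-in for `RelationIdIsInInitFile`, `TOY_INITFILE_REL` and `TOY_DATABASE` are made-up OIDs, and the sketch assumes that an invalidation for a shared relation arrives with `dbId` set to `InvalidOid`. The `Oid` typedef and macros are reproduced only so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal local definitions so the sketch is self-contained. */
typedef unsigned int Oid;
#define InvalidOid			((Oid) 0)
#define OidIsValid(oid)		((bool) ((oid) != InvalidOid))

/* Hypothetical OIDs, chosen only for this example. */
#define TOY_INITFILE_REL	((Oid) 42)
#define TOY_DATABASE		((Oid) 7)

/* Hypothetical stand-in for RelationIdIsInInitFile(). */
static bool
toy_rel_in_init_file(Oid relId)
{
	return relId == TOY_INITFILE_REL;
}

/* Condition removed by the diff: never true when dbId is InvalidOid. */
static bool
zap_init_file_old(Oid dbId, Oid relId)
{
	return OidIsValid(dbId) &&
		(toy_rel_in_init_file(relId) || relId == InvalidOid);
}

/* Condition added by the diff: dbId no longer takes part in the decision. */
static bool
zap_init_file_new(Oid dbId, Oid relId)
{
	(void) dbId;
	return relId == InvalidOid || toy_rel_in_init_file(relId);
}

/* Usage example: an invalidation for a shared relation kept in an init file. */
static void
demo_shared_relation_case(void)
{
	/* Old condition: the init file removal is never requested. */
	assert(!zap_init_file_old(InvalidOid, TOY_INITFILE_REL));
	/* New condition: the init file removal is requested. */
	assert(zap_init_file_new(InvalidOid, TOY_INITFILE_REL));
}
```

For a relation in a specific database both conditions agree; they differ only when `dbId` is `InvalidOid`, which under the assumption above is the shared-relation case that the commit message calls bug 2.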