aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2015-04-03 00:07:29 -0400
committerTom Lane <tgl@sss.pgh.pa.us>2015-04-03 00:07:29 -0400
commitee0d06c0b9993fb06b940c21a7bfde7de8186a2e (patch)
tree4338bd4c3bb98139a036b96ff73ce8179ab1f3d3
parenta1f4ade01ca625332a43434061f1b72788cb71e2 (diff)
downloadpostgresql-ee0d06c0b9993fb06b940c21a7bfde7de8186a2e.tar.gz
postgresql-ee0d06c0b9993fb06b940c21a7bfde7de8186a2e.zip
Fix rare startup failure induced by MVCC-catalog-scans patch.
While a new backend nominally participates in sinval signaling starting from the SharedInvalBackendInit call near the top of InitPostgres, it cannot recognize sinval messages for unshared catalogs of its database until it has set up MyDatabaseId. This is not problematic for the catcache or relcache, which by definition won't have loaded any data from or about such catalogs before that point. However, commit 568d4138c646cd7c introduced a mechanism for re-using MVCC snapshots for catalog scans, and made invalidation of those depend on recognizing relevant sinval messages. So it's possible to establish a catalog snapshot to read pg_authid and pg_database, then before we set MyDatabaseId, receive sinval messages that should result in invalidating that snapshot --- but do not, because we don't realize they are for our database. This mechanism explains the intermittent buildfarm failures we've seen since commit 31eae6028eca4365. That commit was not itself at fault, but it introduced a new regression test that does reconnections concurrently with the "vacuum full pg_am" command in vacuum.sql. This allowed the pre-existing error to be exposed, given just the right timing, because we'd fail to update our information about how to access pg_am. In principle any VACUUM FULL on a system catalog could have created a similar hazard for concurrent incoming connections. Perhaps there are more subtle failure cases as well. To fix, force invalidation of the catalog snapshot as soon as we've set MyDatabaseId. Back-patch to 9.4 where the error was introduced.
-rw-r--r--src/backend/utils/init/postinit.c8
1 files changed, 8 insertions, 0 deletions
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 28243ad58f9..6e269cdfbab 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -824,6 +824,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
MyProc->databaseId = MyDatabaseId;
/*
+ * We established a catalog snapshot while reading pg_authid and/or
+ * pg_database; but until we have set up MyDatabaseId, we won't react to
+ * incoming sinval messages for unshared catalogs, so we won't realize it
+ * if the snapshot has been invalidated. Assume it's no good anymore.
+ */
+ InvalidateCatalogSnapshot();
+
+ /*
* Now, take a writer's lock on the database we are trying to connect to.
* If there is a concurrently running DROP DATABASE on that database, this
* will block us until it finishes (and has committed its update of