diff options
author | Andres Freund <andres@anarazel.de> | 2022-04-06 21:29:46 -0700 |
---|---|---|
committer | Andres Freund <andres@anarazel.de> | 2022-04-06 21:29:46 -0700 |
commit | 5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd (patch) | |
tree | 909f20fa511d5fde6463c58403bb82508c35cfab /src/backend/access/transam/xlog.c | |
parent | be902e26510788c70a874ea54bad753b723d018f (diff) | |
download | postgresql-5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd.tar.gz postgresql-5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd.zip |
pgstat: store statistics in shared memory.
Previously the statistics collector received statistics updates via UDP and
shared statistics data by writing them out to temporary files regularly. These
files can reach tens of megabytes and are written out up to twice a
second. This has repeatedly prevented us from adding additional useful
statistics.
Now statistics are stored in shared memory. Statistics for variable-numbered
objects are stored in a dshash hashtable (backed by dynamic shared
memory). Fixed-numbered stats are stored in plain shared memory.
The header for pgstat.c contains an overview of the architecture.
The stats collector is not needed anymore, remove it.
By utilizing the transactional statistics drop infrastructure introduced in a
prior commit statistics entries cannot "leak" anymore. Previously leaked
statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On
systems with many small relations pgstat_vacuum_stat() could be quite
expensive.
Now that replicas drop statistics entries for dropped objects, it is not
necessary anymore to reset stats when starting from a cleanly shut down
replica.
Subsequent commits will perform some further code cleanup, adapt docs and add
tests.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version)
Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version)
Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version)
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
Diffstat (limited to 'src/backend/access/transam/xlog.c')
-rw-r--r-- | src/backend/access/transam/xlog.c | 39 |
1 files changed, 27 insertions, 12 deletions
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 8ae0a0ba537..c076e48445d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1842,7 +1842,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic) WriteRqst.Flush = 0; XLogWrite(WriteRqst, tli, false); LWLockRelease(WALWriteLock); - WalStats.m_wal_buffers_full++; + PendingWalStats.wal_buffers_full++; TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE(); } /* Re-acquire WALBufMappingLock and retry */ @@ -2200,10 +2200,10 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible) INSTR_TIME_SET_CURRENT(duration); INSTR_TIME_SUBTRACT(duration, start); - WalStats.m_wal_write_time += INSTR_TIME_GET_MICROSEC(duration); + PendingWalStats.wal_write_time += INSTR_TIME_GET_MICROSEC(duration); } - WalStats.m_wal_write++; + PendingWalStats.wal_write++; if (written <= 0) { @@ -4877,6 +4877,7 @@ StartupXLOG(void) XLogCtlInsert *Insert; CheckPoint checkPoint; bool wasShutdown; + bool didCrash; bool haveTblspcMap; bool haveBackupLabel; XLogRecPtr EndOfLog; @@ -4994,7 +4995,10 @@ StartupXLOG(void) { RemoveTempXlogFiles(); SyncDataDirectory(); + didCrash = true; } + else + didCrash = false; /* * Prepare for WAL recovery if needed. @@ -5106,6 +5110,22 @@ StartupXLOG(void) */ restoreTwoPhaseData(); + /* + * When starting with crash recovery, reset pgstat data - it might not be + * valid. Otherwise restore pgstat data. It's safe to do this here, + * because postmaster will not yet have started any other processes. + * + * NB: Restoring replication slot stats relies on slot state to have + * already been restored from disk. + * + * TODO: With a bit of extra work we could just start with a pgstat file + * associated with the checkpoint redo location we're starting from. + */ + if (didCrash) + pgstat_discard_stats(); + else + pgstat_restore_stats(); + lastFullPageWrites = checkPoint.fullPageWrites; RedoRecPtr = XLogCtl->RedoRecPtr = XLogCtl->Insert.RedoRecPtr = checkPoint.redo; @@ -5180,11 +5200,6 @@ StartupXLOG(void) LocalMinRecoveryPointTLI = 0; } - /* - * Reset pgstat data, because it may be invalid after recovery. - */ - pgstat_reset_all(); - /* Check that the GUCs used to generate the WAL allow recovery */ CheckRequiredParameterValues(); @@ -6081,8 +6096,8 @@ LogCheckpointEnd(bool restartpoint) CheckpointStats.ckpt_sync_end_t); /* Accumulate checkpoint timing summary data, in milliseconds. */ - PendingCheckpointerStats.m_checkpoint_write_time += write_msecs; - PendingCheckpointerStats.m_checkpoint_sync_time += sync_msecs; + PendingCheckpointerStats.checkpoint_write_time += write_msecs; + PendingCheckpointerStats.checkpoint_sync_time += sync_msecs; /* * All of the published timing statistics are accounted for. Only @@ -8009,10 +8024,10 @@ issue_xlog_fsync(int fd, XLogSegNo segno, TimeLineID tli) INSTR_TIME_SET_CURRENT(duration); INSTR_TIME_SUBTRACT(duration, start); - WalStats.m_wal_sync_time += INSTR_TIME_GET_MICROSEC(duration); + PendingWalStats.wal_sync_time += INSTR_TIME_GET_MICROSEC(duration); } - WalStats.m_wal_sync++; + PendingWalStats.wal_sync++; } /* |