aboutsummaryrefslogtreecommitdiff
path: root/src/backend/storage/ipc/standby.c
Commit message (Collapse)AuthorAge
* Fix memory leak in LogStandbySnapshot().Tom Lane2013-06-04
| | | | | | | | | | | | | | The array allocated by GetRunningTransactionLocks() needs to be pfree'd when we're done with it. Otherwise we leak some memory during each checkpoint, if wal_level = hot_standby. This manifests as memory bloat in the checkpointer process, or in bgwriter in versions before we made the checkpointer separate. Reported and fixed by Naoya Anzai. Back-patch to 9.0 where the issue was introduced. In passing, improve comments for GetRunningTransactionLocks(), and add an Assert that we didn't overrun the palloc'd array.
* pgindent run for release 9.3Bruce Momjian2013-05-29
| | | | | This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
* Add lock_timeout configuration parameter.Tom Lane2013-03-16
| | | | | | | | | | | | | This GUC allows limiting the time spent waiting to acquire any one heavyweight lock. In support of this, improve the recently-added timeout infrastructure to permit efficiently enabling or disabling multiple timeouts at once. That reduces the performance hit from turning on lock_timeout, though it's still not zero. Zoltán Böszörményi, reviewed by Tom Lane, Stephen Frost, and Hari Babu
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Don't advance checkPoint.nextXid near the end of a checkpoint sequence.Tom Lane2012-12-02
| | | | | | | | | | | | | | | | | | | | | | This reverts commit c11130690d6dca64267201a169cfb38c1adec5ef in favor of actually fixing the problem: namely, that we should never have been modifying the checkpoint record's nextXid at this point to begin with. The nextXid should match the state as of the checkpoint's logical WAL position (ie the redo point), not the state as of its physical position. It's especially bogus to advance it in some wal_levels and not others. In any case there is no need for the checkpoint record to carry the same nextXid shown in the XLOG_RUNNING_XACTS record just emitted by LogStandbySnapshot, as any replay operation will already have adopted that value as current. This fixes bug #7710 from Tarvi Pillessaar, and probably also explains bug #6291 from Daniel Farina, in that if a checkpoint were in progress at the instant of XID wraparound, the epoch bump would be lost as reported. (And, of course, these days there's at least a 50-50 chance of a checkpoint being in progress at any given instant.) Diagnosed by me and independently by Andres Freund. Back-patch to all branches supporting hot standby.
* Rearrange storage of data in xl_running_xacts.Simon Riggs2012-12-02
| | | | | | | | | | | | | Previously we stored all xids mixed together. Now we store top-level xids first, followed by all subxids. Also skip logging any subxids if the snapshot is suboverflowed, since there are potentially large numbers of them and they are not useful in that case anyway. Has value in the envisaged design for decoding of WAL. No planned effect on Hot Standby. Andres Freund, reviewed by me
* Cleanup VirtualXact at end of Hot Standby.Simon Riggs2012-11-29
|
* Split out rmgr rm_desc functions into their own filesAlvaro Herrera2012-11-28
| | | | | This is necessary (but not sufficient) to have them compilable outside of a backend environment.
* Clarify docs on hot standby lock releaseSimon Riggs2012-11-13
| | | | Andres Freund and Simon Riggs
* Introduce timeout handling frameworkAlvaro Herrera2012-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Management of timeouts was getting a little cumbersome; what we originally had was more than enough back when we were only concerned about deadlocks and query cancel; however, when we added timeouts for standby processes, the code got considerably messier. Since there are plans to add more complex timeouts, this seems a good time to introduce a central timeout handling module. External modules register their timeout handlers during process initialization, and later enable and disable them as they see fit using a simple API; timeout.c is in charge of keeping track of which timeouts are in effect at any time, installing a common SIGALRM signal handler, and calling setitimer() as appropriate to ensure timely firing of external handlers. timeout.c additionally supports pluggable modules to add their own timeouts, though this capability isn't exercised anywhere yet. Additionally, as of this commit, walsender processes are aware of timeouts; we had a preexisting bug there that made those ignore SIGALRM, thus being subject to unhandled deadlocks, particularly during the authentication phase. This has already been fixed in back branches in commit 0bf8eb2a, which see for more details. Main author: Zoltán Böszörményi Some review and cleanup by Álvaro Herrera Extensive reworking by Tom Lane
* Replace XLogRecPtr struct with a 64-bit integer.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | This simplifies code that needs to do arithmetic on XLogRecPtrs. To avoid changing on-disk format of data pages, the LSN on data pages is still stored in the old format. That should keep pg_upgrade happy. However, we have XLogRecPtrs embedded in the control file, and in the structs that are sent over the replication protocol, so this changes breaks compatibility of pg_basebackup and server. I didn't do anything about this in this patch, per discussion on -hackers, the right thing to do would to be to change the replication protocol to be architecture-independent, so that you could use a newer version of pg_receivexlog, for example, against an older server version.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Further corrections from the department of redundancy department.Robert Haas2012-05-02
| | | | Thom Brown
* Remove duplicate words in comments.Heikki Linnakangas2012-05-02
| | | | Found these with grep -r "for for ".
* Add missing Assert and fix inaccurate elog message in standby_redo().Tom Lane2012-02-04
| | | | | | All other WAL redo routines either call RestoreBkpBlocks() or Assert that they haven't been passed any backup blocks. Make this one do likewise. Also, fix incorrect routine name in its failure message.
* Resolve timing issue with logging locks for Hot Standby.Simon Riggs2012-01-23
| | | | | | | | | | We log AccessExclusiveLocks for replay onto standby nodes, but because of timing issues on ProcArray it is possible to log a lock that is still held by a just committed transaction that is very soon to be removed. To avoid any timing issue we avoid applying locks made by transactions with InvalidXid. Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Derive oldestActiveXid at correct time for Hot Standby.Simon Riggs2011-11-02
| | | | | | | | | There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop
* Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h.Tom Lane2011-09-09
| | | | | | | | | | | As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.
* Create VXID locks "lazily" in the main lock table.Robert Haas2011-08-04
| | | | | | | | | | | | | Instead of entering them on transaction startup, we materialize them only when someone wants to wait, which will occur only during CREATE INDEX CONCURRENTLY. In Hot Standby mode, the startup process must also be able to probe for conflicting VXID locks, but the lock need never be fully materialized, because the startup process does not use the normal lock wait mechanism. Since most VXID locks never need to touch the lock manager partition locks, this can significantly reduce blocking contention on read-heavy workloads. Patch by me. Review by Jeff Davis.
* Move CheckRecoveryConflictDeadlock() call to a safer place.Tom Lane2011-08-02
| | | | | | | | | | | | | This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit c85c941470efc44494fd7a5f426ee85fc65c268c. Back-patch to all affected branches.
* Clean up most -Wunused-but-set-variable warnings from gcc 4.6Peter Eisentraut2011-04-11
| | | | | | This warning is new in gcc 4.6 and part of -Wall. This patch cleans up most of the noise, but there are some still warnings that are trickier to remove.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Prevent intermittent hang in recovery from bgwriter interaction.Simon Riggs2011-03-23
| | | | | | Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
* Fix error code for canceling statement due to conflict with recovery.Simon Riggs2011-01-31
| | | | | | | All retryable conflict errors now have an error code that indicates that a retry is possible, correcting my incomplete fix of 2010/05/12 Tatsuo Ishii and Simon Riggs, input from Robert Haas and Florian Pflug
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Try to save a kernel call in ResolveRecoveryConflictWithVirtualXIDs.Robert Haas2010-12-17
| | | | If there's no work to be done, just exit quickly, before initialization.
* Reset 'ps' display just once when resolving VXID conflicts.Robert Haas2010-12-17
| | | | | | | | | | | This prevents the word "waiting" from briefly disappearing from the ps status line when ResolveRecoveryConflictWithVirtualXIDs begins a new iteration of the outer loop. Along the way, remove some useless pgstat_report_waiting() calls; the startup process doesn't appear in pg_stat_activity. Fujii Masao
* Fix bugs in the hot standby known-assigned-xids tracking logic. If there'sHeikki Linnakangas2010-12-07
| | | | | | | | | | | | | | | | | | | | | | | an old transaction running in the master, and a lot of transactions have started and finished since, and a WAL-record is written in the gap between the creating the running-xacts snapshot and WAL-logging it, recovery will fail with "too many KnownAssignedXids" error. This bug was reported by Joachim Wieland on Nov 19th. In the same scenario, when fewer transactions have started so that all the xids fit in KnownAssignedXids despite the first bug, a more serious bug arises. We incorrectly initialize the clog code with the oldest still running transaction, and when we see the WAL record belonging to a transaction with an XID larger than one that committed already before the checkpoint we're recovering from, we zero the clog page containing the already committed transaction, leading to data loss. In hindsight, trying to track xids in the known-assigned-xids array before seeing the running-xacts record was too complicated. To fix that, hold XidGenLock while the running-xacts snapshot is taken and WAL-logged. That ensures that no transaction can begin or end in that gap, so that in recvoery we know that the snapshot contains all transactions running at that point in WAL.
* Move call to GetTopTransactionId() earlier in LockAcquire(),Simon Riggs2010-11-29
| | | | | | | | | | removing an infrequently occurring race condition in Hot Standby. An xid must be assigned before a lock appears in shared memory, rather than immediately after, else GetRunningTransactionLocks() may see InvalidTransactionId, causing assertion failures during lock processing on standby. Bug report and diagnosis by Fujii Masao, fix by me.
* Avoid spurious Hot Standby conflicts from btree delete records.Simon Riggs2010-11-15
| | | | | | | Similar conflicts were already avoided for related record types. Massive over-caution resulted in a usability bug. Clear theoretical basis for doing this is now confirmed by me. Request to remove from Heikki (twice), over-caution by me.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Bring some sanity to the trace_recovery_messages code and docs.Tom Lane2010-08-19
| | | | | | | | | | Per gripe from Fujii Masao, though this is not exactly his proposed patch. Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii, but set the default to LOG because higher values aren't really sensible (see the code for trace_recovery()). Fix the documentation to agree with the code and to try to explain what the variable actually does. Get rid of no-op calls trace_recovery(LOG), which accomplish nothing except to demonstrate that this option confuses even its author.
* Correct sundry errors in Hot Standby-related comments.Robert Haas2010-08-12
| | | | Fujii Masao
* pgindent run for 9.0, second runBruce Momjian2010-07-06
|
* Replace max_standby_delay with two parameters, max_standby_archive_delay andTom Lane2010-07-03
| | | | | | | | | | | | | | max_standby_streaming_delay, and revise the implementation to avoid assuming that timestamps found in WAL records can meaningfully be compared to clock time on the standby server. Instead, the delay limits are compared to the elapsed time since we last obtained a new WAL segment from archive or since we were last "caught up" to WAL data arriving via streaming replication. This avoids problems with clock skew between primary and standby, as well as other corner cases that the original coding would misbehave in, such as the primary server having significant idle time between transactions. Per my complaint some time ago and considerable ensuing discussion. Do some desultory editing on the hot standby documentation, too.
* Remove max_standby_delay message from ps display of recovery processItagaki Takahiro2010-06-14
| | | | | in waiting status. The parameter is not so interesting in ps display because it is referable in postgresql.conf.
* HS Defer buffer pin deadlock check until deadlock_timeout has expired.Simon Riggs2010-05-26
| | | | | | | | | | | | | During Hot Standby we need to check for buffer pin deadlocks when the Startup process begins to wait, in case it never wakes up again. We previously made the deadlock check immediately on the basis it was cheap, though clearer thinking and prima facie evidence shows that was too simple. Refactor existing code to make it easy to add in deferral of deadlock check until deadlock_timeout allowing a good reduction in deadlock checks since far few buffer pins are held for that duration. It's worth doing anyway, though major goal is to prevent further reports of context switching with high numbers of users on occasional tests.
* Add many new Asserts in code and fix simple bug that slipped throughSimon Riggs2010-05-14
| | | | without them, related to previous commit. Report by Bruce Momjian.
* Cleanup initialization of Hot Standby. Clarify working with reanalysisSimon Riggs2010-05-13
| | | | | | | | | of requirements and documentation on LogStandbySnapshot(). Fixes two minor bugs reported by Tom Lane that would lead to an incorrect snapshot after transaction wraparound. Also fix two other problems discovered that would give incorrect snapshots in certain cases. ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().
* Clean up some awkward, inaccurate, and inefficient processing aroundTom Lane2010-05-02
| | | | | | | | | | | | MaxStandbyDelay. Use the GUC units mechanism for the value, and choose more appropriate timestamp functions for performing tests with it. Make the ps_activity manipulation in ResolveRecoveryConflictWithVirtualXIDs have behavior similar to ps_activity code elsewhere, notably not updating the display when update_process_title is off and not truncating the display contents at an arbitrarily-chosen length. Improve the docs to be explicit about what MaxStandbyDelay actually measures, viz the difference between primary and standby servers' clocks, and the possible hazards if their clocks aren't in sync.
* Introduce wal_level GUC to explicitly control if information needed forHeikki Linnakangas2010-04-28
| | | | | | | | | | | | | | | | | | | | | | archival or hot standby should be WAL-logged, instead of deducing that from other options like archive_mode. This replaces recovery_connections GUC in the primary, where it now has no effect, but it's still used in the standby to enable/disable hot standby. Remove the WAL-logging of "unlogged operations", like creating an index without WAL-logging and fsyncing it at the end. Instead, we keep a copy of the wal_mode setting and the settings that affect how much shared memory a hot standby server needs to track master transactions (max_connections, max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings change, at server restart, write a WAL record noting the new settings and update pg_control. This allows us to notice the change in those settings in the standby at the right moment, they used to be included in checkpoint records, but that meant that a changed value was not reflected in the standby until the first checkpoint after the change. Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to the sequence it used to follow, before hot standby and subsequent patches changed it to 0x9003.
* Fix various instances of "the the".Robert Haas2010-04-23
| | | | Two of these were pointed out by Erik Rijkers; the rest I found.
* Optimise btree delete processing when no active backends.Simon Riggs2010-04-22
| | | | | Clarify comments, downgrade a message to DEBUG and remove some debug counters. Direct from ideas by Heikki Linnakangas.
* Relax locking during GetCurrentVirtualXIDs(). Earlier improvementsSimon Riggs2010-04-21
| | | | | | | | | | to handling of btree delete records mean that all snapshot conflicts on standby now have a valid, useful latestRemovedXid. Our earlier approach using LW_EXCLUSIVE was useful when we didnt always have a valid value, though is no longer useful or necessary. Asserts added to code path to prove and ensure this is the case. This will reduce contention and improve performance of larger Hot Standby servers.
* Change some debug ereports to elogs, as requested by translation team.Simon Riggs2010-04-06
|
* Fix comment which was apparently copy-pasted from another function.Heikki Linnakangas2010-03-11
|
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Improvements to ps message of startup process during Hot Standby.Simon Riggs2010-02-13
| | | | | | Message is reset earlier and potential bug avoided. Andres Freund
* Re-enable max_standby_delay = -1 using deadlock detection on startupSimon Riggs2010-02-13
| | | | | | | | process. If startup waits on a buffer pin we send a request to all backends to cancel themselves if they are holding the buffer pin required and they are also waiting on a lock. If not, startup waits until max_standby_delay before cancelling any backend waiting for the requested buffer pin.