aboutsummaryrefslogtreecommitdiff
path: root/src/backend/storage/ipc/standby.c
Commit message (Collapse)AuthorAge
* Introduce timeout handling frameworkAlvaro Herrera2012-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Management of timeouts was getting a little cumbersome; what we originally had was more than enough back when we were only concerned about deadlocks and query cancel; however, when we added timeouts for standby processes, the code got considerably messier. Since there are plans to add more complex timeouts, this seems a good time to introduce a central timeout handling module. External modules register their timeout handlers during process initialization, and later enable and disable them as they see fit using a simple API; timeout.c is in charge of keeping track of which timeouts are in effect at any time, installing a common SIGALRM signal handler, and calling setitimer() as appropriate to ensure timely firing of external handlers. timeout.c additionally supports pluggable modules to add their own timeouts, though this capability isn't exercised anywhere yet. Additionally, as of this commit, walsender processes are aware of timeouts; we had a preexisting bug there that made those ignore SIGALRM, thus being subject to unhandled deadlocks, particularly during the authentication phase. This has already been fixed in back branches in commit 0bf8eb2a, which see for more details. Main author: Zoltán Böszörményi Some review and cleanup by Álvaro Herrera Extensive reworking by Tom Lane
* Replace XLogRecPtr struct with a 64-bit integer.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | This simplifies code that needs to do arithmetic on XLogRecPtrs. To avoid changing on-disk format of data pages, the LSN on data pages is still stored in the old format. That should keep pg_upgrade happy. However, we have XLogRecPtrs embedded in the control file, and in the structs that are sent over the replication protocol, so this changes breaks compatibility of pg_basebackup and server. I didn't do anything about this in this patch, per discussion on -hackers, the right thing to do would to be to change the replication protocol to be architecture-independent, so that you could use a newer version of pg_receivexlog, for example, against an older server version.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Further corrections from the department of redundancy department.Robert Haas2012-05-02
| | | | Thom Brown
* Remove duplicate words in comments.Heikki Linnakangas2012-05-02
| | | | Found these with grep -r "for for ".
* Add missing Assert and fix inaccurate elog message in standby_redo().Tom Lane2012-02-04
| | | | | | All other WAL redo routines either call RestoreBkpBlocks() or Assert that they haven't been passed any backup blocks. Make this one do likewise. Also, fix incorrect routine name in its failure message.
* Resolve timing issue with logging locks for Hot Standby.Simon Riggs2012-01-23
| | | | | | | | | | We log AccessExclusiveLocks for replay onto standby nodes, but because of timing issues on ProcArray it is possible to log a lock that is still held by a just committed transaction that is very soon to be removed. To avoid any timing issue we avoid applying locks made by transactions with InvalidXid. Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Derive oldestActiveXid at correct time for Hot Standby.Simon Riggs2011-11-02
| | | | | | | | | There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop
* Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h.Tom Lane2011-09-09
| | | | | | | | | | | As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.
* Create VXID locks "lazily" in the main lock table.Robert Haas2011-08-04
| | | | | | | | | | | | | Instead of entering them on transaction startup, we materialize them only when someone wants to wait, which will occur only during CREATE INDEX CONCURRENTLY. In Hot Standby mode, the startup process must also be able to probe for conflicting VXID locks, but the lock need never be fully materialized, because the startup process does not use the normal lock wait mechanism. Since most VXID locks never need to touch the lock manager partition locks, this can significantly reduce blocking contention on read-heavy workloads. Patch by me. Review by Jeff Davis.
* Move CheckRecoveryConflictDeadlock() call to a safer place.Tom Lane2011-08-02
| | | | | | | | | | | | | This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit c85c941470efc44494fd7a5f426ee85fc65c268c. Back-patch to all affected branches.
* Clean up most -Wunused-but-set-variable warnings from gcc 4.6Peter Eisentraut2011-04-11
| | | | | | This warning is new in gcc 4.6 and part of -Wall. This patch cleans up most of the noise, but there are some still warnings that are trickier to remove.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Prevent intermittent hang in recovery from bgwriter interaction.Simon Riggs2011-03-23
| | | | | | Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
* Fix error code for canceling statement due to conflict with recovery.Simon Riggs2011-01-31
| | | | | | | All retryable conflict errors now have an error code that indicates that a retry is possible, correcting my incomplete fix of 2010/05/12 Tatsuo Ishii and Simon Riggs, input from Robert Haas and Florian Pflug
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Try to save a kernel call in ResolveRecoveryConflictWithVirtualXIDs.Robert Haas2010-12-17
| | | | If there's no work to be done, just exit quickly, before initialization.
* Reset 'ps' display just once when resolving VXID conflicts.Robert Haas2010-12-17
| | | | | | | | | | | This prevents the word "waiting" from briefly disappearing from the ps status line when ResolveRecoveryConflictWithVirtualXIDs begins a new iteration of the outer loop. Along the way, remove some useless pgstat_report_waiting() calls; the startup process doesn't appear in pg_stat_activity. Fujii Masao
* Fix bugs in the hot standby known-assigned-xids tracking logic. If there'sHeikki Linnakangas2010-12-07
| | | | | | | | | | | | | | | | | | | | | | | an old transaction running in the master, and a lot of transactions have started and finished since, and a WAL-record is written in the gap between the creating the running-xacts snapshot and WAL-logging it, recovery will fail with "too many KnownAssignedXids" error. This bug was reported by Joachim Wieland on Nov 19th. In the same scenario, when fewer transactions have started so that all the xids fit in KnownAssignedXids despite the first bug, a more serious bug arises. We incorrectly initialize the clog code with the oldest still running transaction, and when we see the WAL record belonging to a transaction with an XID larger than one that committed already before the checkpoint we're recovering from, we zero the clog page containing the already committed transaction, leading to data loss. In hindsight, trying to track xids in the known-assigned-xids array before seeing the running-xacts record was too complicated. To fix that, hold XidGenLock while the running-xacts snapshot is taken and WAL-logged. That ensures that no transaction can begin or end in that gap, so that in recvoery we know that the snapshot contains all transactions running at that point in WAL.
* Move call to GetTopTransactionId() earlier in LockAcquire(),Simon Riggs2010-11-29
| | | | | | | | | | removing an infrequently occurring race condition in Hot Standby. An xid must be assigned before a lock appears in shared memory, rather than immediately after, else GetRunningTransactionLocks() may see InvalidTransactionId, causing assertion failures during lock processing on standby. Bug report and diagnosis by Fujii Masao, fix by me.
* Avoid spurious Hot Standby conflicts from btree delete records.Simon Riggs2010-11-15
| | | | | | | Similar conflicts were already avoided for related record types. Massive over-caution resulted in a usability bug. Clear theoretical basis for doing this is now confirmed by me. Request to remove from Heikki (twice), over-caution by me.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Bring some sanity to the trace_recovery_messages code and docs.Tom Lane2010-08-19
| | | | | | | | | | Per gripe from Fujii Masao, though this is not exactly his proposed patch. Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii, but set the default to LOG because higher values aren't really sensible (see the code for trace_recovery()). Fix the documentation to agree with the code and to try to explain what the variable actually does. Get rid of no-op calls trace_recovery(LOG), which accomplish nothing except to demonstrate that this option confuses even its author.
* Correct sundry errors in Hot Standby-related comments.Robert Haas2010-08-12
| | | | Fujii Masao
* pgindent run for 9.0, second runBruce Momjian2010-07-06
|
* Replace max_standby_delay with two parameters, max_standby_archive_delay andTom Lane2010-07-03
| | | | | | | | | | | | | | max_standby_streaming_delay, and revise the implementation to avoid assuming that timestamps found in WAL records can meaningfully be compared to clock time on the standby server. Instead, the delay limits are compared to the elapsed time since we last obtained a new WAL segment from archive or since we were last "caught up" to WAL data arriving via streaming replication. This avoids problems with clock skew between primary and standby, as well as other corner cases that the original coding would misbehave in, such as the primary server having significant idle time between transactions. Per my complaint some time ago and considerable ensuing discussion. Do some desultory editing on the hot standby documentation, too.
* Remove max_standby_delay message from ps display of recovery processItagaki Takahiro2010-06-14
| | | | | in waiting status. The parameter is not so interesting in ps display because it is referable in postgresql.conf.
* HS Defer buffer pin deadlock check until deadlock_timeout has expired.Simon Riggs2010-05-26
| | | | | | | | | | | | | During Hot Standby we need to check for buffer pin deadlocks when the Startup process begins to wait, in case it never wakes up again. We previously made the deadlock check immediately on the basis it was cheap, though clearer thinking and prima facie evidence shows that was too simple. Refactor existing code to make it easy to add in deferral of deadlock check until deadlock_timeout allowing a good reduction in deadlock checks since far few buffer pins are held for that duration. It's worth doing anyway, though major goal is to prevent further reports of context switching with high numbers of users on occasional tests.
* Add many new Asserts in code and fix simple bug that slipped throughSimon Riggs2010-05-14
| | | | without them, related to previous commit. Report by Bruce Momjian.
* Cleanup initialization of Hot Standby. Clarify working with reanalysisSimon Riggs2010-05-13
| | | | | | | | | of requirements and documentation on LogStandbySnapshot(). Fixes two minor bugs reported by Tom Lane that would lead to an incorrect snapshot after transaction wraparound. Also fix two other problems discovered that would give incorrect snapshots in certain cases. ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().
* Clean up some awkward, inaccurate, and inefficient processing aroundTom Lane2010-05-02
| | | | | | | | | | | | MaxStandbyDelay. Use the GUC units mechanism for the value, and choose more appropriate timestamp functions for performing tests with it. Make the ps_activity manipulation in ResolveRecoveryConflictWithVirtualXIDs have behavior similar to ps_activity code elsewhere, notably not updating the display when update_process_title is off and not truncating the display contents at an arbitrarily-chosen length. Improve the docs to be explicit about what MaxStandbyDelay actually measures, viz the difference between primary and standby servers' clocks, and the possible hazards if their clocks aren't in sync.
* Introduce wal_level GUC to explicitly control if information needed forHeikki Linnakangas2010-04-28
| | | | | | | | | | | | | | | | | | | | | | archival or hot standby should be WAL-logged, instead of deducing that from other options like archive_mode. This replaces recovery_connections GUC in the primary, where it now has no effect, but it's still used in the standby to enable/disable hot standby. Remove the WAL-logging of "unlogged operations", like creating an index without WAL-logging and fsyncing it at the end. Instead, we keep a copy of the wal_mode setting and the settings that affect how much shared memory a hot standby server needs to track master transactions (max_connections, max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings change, at server restart, write a WAL record noting the new settings and update pg_control. This allows us to notice the change in those settings in the standby at the right moment, they used to be included in checkpoint records, but that meant that a changed value was not reflected in the standby until the first checkpoint after the change. Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to the sequence it used to follow, before hot standby and subsequent patches changed it to 0x9003.
* Fix various instances of "the the".Robert Haas2010-04-23
| | | | Two of these were pointed out by Erik Rijkers; the rest I found.
* Optimise btree delete processing when no active backends.Simon Riggs2010-04-22
| | | | | Clarify comments, downgrade a message to DEBUG and remove some debug counters. Direct from ideas by Heikki Linnakangas.
* Relax locking during GetCurrentVirtualXIDs(). Earlier improvementsSimon Riggs2010-04-21
| | | | | | | | | | to handling of btree delete records mean that all snapshot conflicts on standby now have a valid, useful latestRemovedXid. Our earlier approach using LW_EXCLUSIVE was useful when we didnt always have a valid value, though is no longer useful or necessary. Asserts added to code path to prove and ensure this is the case. This will reduce contention and improve performance of larger Hot Standby servers.
* Change some debug ereports to elogs, as requested by translation team.Simon Riggs2010-04-06
|
* Fix comment which was apparently copy-pasted from another function.Heikki Linnakangas2010-03-11
|
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Improvements to ps message of startup process during Hot Standby.Simon Riggs2010-02-13
| | | | | | Message is reset earlier and potential bug avoided. Andres Freund
* Re-enable max_standby_delay = -1 using deadlock detection on startupSimon Riggs2010-02-13
| | | | | | | | process. If startup waits on a buffer pin we send a request to all backends to cancel themselves if they are holding the buffer pin required and they are also waiting on a lock. If not, startup waits until max_standby_delay before cancelling any backend waiting for the requested buffer pin.
* Fix typo bug in Hot Standby from recent refactoring. Bug introducedSimon Riggs2010-02-11
| | | | | into code recently patched by Andres Freund, so quickly fixed by him when bug report from Tatsuo Ishii arrived.
* Fix assorted poorly-thought-out message strings: use %u not %d for printingTom Lane2010-02-02
| | | | OIDs, avoid random line breaks in strings somebody might grep for.
* Detect early deadlock in Hot Standby when Startup is already waiting. FirstSimon Riggs2010-01-31
| | | | | | stage of required deadlock detection to allow re-enabling max_standby_delay setting of -1, which is now essential in the absence of improved relation- specific conflict resoluton. Requested by Greg Stark et al.
* Filter recovery conflicts based upon dboid from relfilenode of WALSimon Riggs2010-01-29
| | | | | | | | records for heap and btree. Minor change, mostly API changes to pass through the required values. This is a simple change though also provides the refactoring required for further enhancements to conflict processing using the relOid. Changes only have effect during Hot Standby.
* In HS, Startup process sets SIGALRM when waiting for buffer pin. IfSimon Riggs2010-01-23
| | | | | | | woken by alarm we send SIGUSR1 to all backends requesting that they check to see if they are blocking Startup process. If so, they throw ERROR/FATAL as for other conflict resolutions. Deadlock stop gap removed. max_standby_delay = -1 option removed to prevent deadlock.
* Message mentions msec when it should be seconds, so use s instead of ms.Simon Riggs2010-01-16
| | | | Noticed by Andres Freund
* Teach standby conflict resolution to use SIGUSR1Simon Riggs2010-01-16
| | | | | | | | | | Conflict reason is passed through directly to the backend, so we can take decisions about the effect of the conflict based upon the local state. No specific changes, as yet, though this prepares for later work. CancelVirtualTransaction() sends signals while holding ProcArrayLock. Introduce errdetail_abort() to give message detail explaining that the abort was caused by conflict processing. Remove CONFLICT_MODE states in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
* First part of refactoring of code for ResolveRecoveryConflict. PurposesSimon Riggs2010-01-14
| | | | | | | | of this are to centralise the conflict code to allow further change, as well as to allow passing through the full reason for the conflict through to the conflicting backends. Backend state alters how we can handle different types of conflict so this is now required. As originally suggested by Heikki, no longer optional.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|