aboutsummaryrefslogtreecommitdiff
path: root/src/include/access/xlog.h
Commit message (Collapse)AuthorAge
...
* Don't waste the last segment of each 4GB logical log file.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | | | The comments claimed that wasting the last segment made it easier to do calculations with XLogRecPtrs, because you don't have problems representing last-byte-position-plus-1 that way. In my experience, however, it only made things more complicated, because the there was two ways to represent the boundary at the beginning of a logical log file: logid = n+1 and xrecoff = 0, or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were picky about which representation was used. Also, use a 64-bit segment number instead of the log/seg combination, to point to a certain WAL segment. We assume that all platforms have a working 64-bit integer type nowadays. This is an incompatible change in WAL format, so bumping WAL version number.
* Fix an issue in recent walwriter hibernation patch.Tom Lane2012-05-08
| | | | | | | | | Users of asynchronous-commit mode expect there to be a guaranteed maximum delay before an async commit's WAL records get flushed to disk. The original version of the walwriter hibernation patch broke that. Add an extra shared-memory flag to allow async commits to kick the walwriter out of hibernation mode, without adding any noticeable overhead in cases where no action is needed.
* Reduce idle power consumption of walwriter and checkpointer processes.Tom Lane2012-05-08
| | | | | | | | | | | | | | | | | | | | | | | This patch modifies the walwriter process so that, when it has not found anything useful to do for many consecutive wakeup cycles, it extends its sleep time to reduce the server's idle power consumption. It reverts to normal as soon as it's done any successful flushes. It's still true that during any async commit, backends check for completed, unflushed pages of WAL and signal the walwriter if there are any; so that in practice the walwriter can get awakened and returned to normal operation sooner than the sleep time might suggest. Also, improve the checkpointer so that it uses a latch and a computed delay time to not wake up at all except when it has something to do, replacing a previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for reducing the server's power consumption when idle. In passing, get rid of the dedicated latch for signaling the walwriter in favor of using its procLatch, since that comports better with possible generic signal handlers using that latch. Also, fix a pre-existing bug with failure to save/restore errno in walwriter's signal handlers. Peter Geoghegan, somewhat simplified by Tom
* Allow pg_basebackup from standby node with safety checking.Simon Riggs2012-01-25
| | | | | | | Base backup follows recommended procedure, plus goes to great lengths to ensure that partial page writes are avoided. Jun Ishizuka and Fujii Masao, with minor modifications
* Remove useless 'needlock' argument from GetXLogInsertRecPtr. It was alwaysHeikki Linnakangas2012-01-11
| | | | passed as 'true'.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Send new protocol keepalive messages to standby servers.Simon Riggs2011-12-31
| | | | | Allows streaming replication users to calculate transfer latency and apply delay via internal functions. No external functions yet.
* Move BKP_REMOVABLE bit from individual WAL records to WAL page headers.Tom Lane2011-12-12
| | | | | | | | | | | | | | | | | | | Removing this bit from xl_info allows us to restore the old limit of four (not three) separate pages touched by a WAL record, which is needed for the upcoming SP-GiST feature, and will likely be useful elsewhere in future. When we implemented XLR_BKP_REMOVABLE in 2007, we had to do it like that because no special WAL-visible action was taken when starting a backup. However, now we force a segment switch when starting a backup, so a compressing WAL archiver (such as pglesslog) that uses the state shown in the current page header will not be fooled as to removability of backup blocks. The only downside is that the archiver will not return to compressing mode for up to one WAL page after the backup is over, which is a small price to pay for getting back the extra xl_info bit. In any case the archiver could look for XLOG_BACKUP_END records if it thought it was worth the trouble to do so. Bump XLOG_PAGE_MAGIC since this is effectively a change in WAL format.
* Don't set reachedMinRecoveryPoint during crash recovery. In crash recovery,Heikki Linnakangas2011-12-09
| | | | | | | | | | | | | | | we don't reach consistency before replaying all of the WAL. Rename the variable to reachedConsistency, to make its intention clearer. In master, that was an active bug because of the recent patch to immediately PANIC if a reference to a missing page is found in WAL after reaching consistency, as Tom Lane's test case demonstrated. In 9.1 and 9.0, the only consequence was a misleading "consistent recovery state reached at %X/%X" message in the log at the beginning of crash recovery (the database is not consistent at that point yet). In 8.4, the log message was not printed in crash recovery, even though there was a similar reachedMinRecoveryPoint local variable that was also set early. So, backpatch to 9.1 and 9.0.
* During recovery, if we reach consistent state and still have entries in theHeikki Linnakangas2011-12-02
| | | | | | | | | | | | | | | invalid-page hash table, PANIC immediately. Immediate PANIC is much better than waiting for end-of-recovery, which is what we did before, because the end-of-recovery might not come until months later if this is a standby server. Also refrain from creating a restartpoint if there are invalid-page entries in the hash table. Restarting recovery from such a restartpoint would not see the invalid references, and wouldn't be able to cross-check them when consistency is reached. That wouldn't matter when things are going smoothly, but the more sanity checks you have the better. Fujii Masao
* Wakeup WALWriter as needed for asynchronous commit performance.Simon Riggs2011-11-13
| | | | | | | | Previously we waited for wal_writer_delay before flushing WAL. Now we also wake WALWriter as soon as a WAL buffer page has filled. Significant effect observed on performance of asynchronous commits by Robert Haas, attributed to the ability to set hint bits on tuples earlier and so reducing contention caused by clog lookups.
* Move user functions related to WAL into xlogfuncs.cSimon Riggs2011-11-04
|
* Refactor xlog.c to create src/backend/postmaster/startup.cSimon Riggs2011-11-02
| | | | | Startup process now has its own dedicated file, just like all other special/background processes. Reduces role and size of xlog.c
* Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h.Tom Lane2011-09-09
| | | | | | | | | | | As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.
* Clean up the #include mess a little.Tom Lane2011-09-04
| | | | | | | | | | | | | | | | | walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.
* Move AllowCascadeReplication() define from xlog.h to replication includeBruce Momjian2011-09-03
| | | | | | file. Per suggestion from Alvaro.
* Allow more include files to be compiled in their own by adding missingBruce Momjian2011-08-27
| | | | | | include dependencies. Modify pgcompinclude to skip a common fcinfo error.
* Cascading replication feature for streaming log-based replication.Simon Riggs2011-07-19
| | | | | | | | | Standby servers can now have WALSender processes, which can work with either WALReceiver or archive_commands to pass data. Fully updated docs, including new conceptual terms of sending server, upstream and downstream servers. WALSenders terminated when promote to master. Fujii Masao, review, rework and doc rewrite by Simon Riggs
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Hot Standby feedback for avoidance of cleanup conflicts on standby.Simon Riggs2011-02-16
| | | | | | | | | | | | | Standby optionally sends back information about oldestXmin of queries which is then checked and applied to the WALSender's proc->xmin. GetOldestXmin() is modified slightly to agree with GetSnapshotData(), so that all backends on primary include WALSender within their snapshots. Note this does nothing to change the snapshot xmin on either master or standby. Feedback piggybacks on the standby reply message. vacuum_defer_cleanup_age is no longer used on standby, though parameter still exists on primary, since some use cases still exist. Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas
* pg_ctl promoteRobert Haas2011-02-15
| | | | Fujii Masao, reviewed by Robert Haas, Stephen Frost, and Magnus Hagander.
* Send status updates back from standby server to master, indicating how farHeikki Linnakangas2011-02-10
| | | | | | | | | | the standby has written, flushed, and applied the WAL. At the moment, this is for informational purposes only, the values are only shown in pg_stat_replication system view, but in the future they will also be needed for synchronous replication. Extracted from Simon riggs' synchronous replication patch by Robert Haas, with some tweaking by me.
* Implement NOWAIT option for BASE_BACKUP commandMagnus Hagander2011-02-09
| | | | | | | | | | Specifying this option makes the server not wait for the xlog to be archived, or emit a warning that it can't, instead leaving the responsibility with the client. This is useful when the log is being streamed using the streaming protocol in parallel with the backup, without having log archiving enabled.
* Named restore points in recovery. Users can record named points, thenSimon Riggs2011-02-08
| | | | | | | new recovery.conf parameter recovery_target_name allows PITR to specify named points as recovery targets. Jaime Casanova, reviewed by Euler Taveira de Oliveira, plus minor edits
* Support multiple concurrent pg_basebackup backups.Heikki Linnakangas2011-01-31
| | | | | | | | | With this patch, pg_basebackup doesn't write a backup_label file in the data directory, so it doesn't interfere with a pg_start/stop_backup() based backup anymore. backup_label is still included in the backup, but it is injected directly into the tar stream. Heikki Linnakangas, reviewed by Fujii Masao and Magnus Hagander.
* Split pg_start_backup() and pg_stop_backup() into two piecesMagnus Hagander2011-01-09
| | | | | | | | | | Move the actual functionality into a separate function that's easier to call internally, and change the SQL-callable function to be a wrapper calling this. Also create a pg_abort_backup() function, only callable internally, that does only the most vital parts of pg_stop_backup(), making it safe(r) to call from error handlers.
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Instrument checkpoint sync calls.Robert Haas2010-12-14
| | | | Greg Smith, reviewed by Jeff Janes
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Use a latch to make startup process wake up and replay immediately whenHeikki Linnakangas2010-09-15
| | | | | | | | | | new WAL arrives via streaming replication. This reduces the latency, and also allows us to use a longer polling interval, which is good for energy efficiency. We still need to poll to check for the appearance of a trigger file, but the interval is now 5 seconds (instead of 100ms), like when waiting for a new WAL segment to appear in WAL archive.
* Correct sundry errors in Hot Standby-related comments.Robert Haas2010-08-12
| | | | Fujii Masao
* Rename asyncCommitLSN to asyncXactLSN to reflect changed role in 9.0.Simon Riggs2010-07-29
| | | | | | Transaction aborts now record their LSN to avoid corner case behaviour in SR/HS, hence change of name of variables and functions. As pointed out by Fujii Masao. Cosmetic changes only.
* Replace max_standby_delay with two parameters, max_standby_archive_delay andTom Lane2010-07-03
| | | | | | | | | | | | | | max_standby_streaming_delay, and revise the implementation to avoid assuming that timestamps found in WAL records can meaningfully be compared to clock time on the standby server. Instead, the delay limits are compared to the elapsed time since we last obtained a new WAL segment from archive or since we were last "caught up" to WAL data arriving via streaming replication. This avoids problems with clock skew between primary and standby, as well as other corner cases that the original coding would misbehave in, such as the primary server having significant idle time between transactions. Per my complaint some time ago and considerable ensuing discussion. Do some desultory editing on the hot standby documentation, too.
* Don't allow walsender to send WAL data until it's been safely fsync'd on theTom Lane2010-06-17
| | | | | | | | master. Otherwise a subsequent crash could cause the master to lose WAL that has already been applied on the slave, resulting in the slave being out of sync and soon corrupt. Per recent discussion and an example from Robert Haas. Fujii Masao
* Make TriggerFile variable static. It's not used outside xlog.c.Heikki Linnakangas2010-06-10
| | | | Fujii Masao
* Rename the parameter recovery_connections to hot_standby, to reduce possibleTom Lane2010-04-29
| | | | | | | | confusion with streaming-replication settings. Also, change its default value to "off", because of concern about executing new and poorly-tested code during ordinary non-replicating operation. Per discussion. In passing do some minor editing of related documentation.
* Introduce wal_level GUC to explicitly control if information needed forHeikki Linnakangas2010-04-28
| | | | | | | | | | | | | | | | | | | | | | archival or hot standby should be WAL-logged, instead of deducing that from other options like archive_mode. This replaces recovery_connections GUC in the primary, where it now has no effect, but it's still used in the standby to enable/disable hot standby. Remove the WAL-logging of "unlogged operations", like creating an index without WAL-logging and fsyncing it at the end. Instead, we keep a copy of the wal_mode setting and the settings that affect how much shared memory a hot standby server needs to track master transactions (max_connections, max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings change, at server restart, write a WAL record noting the new settings and update pg_control. This allows us to notice the change in those settings in the standby at the right moment, they used to be included in checkpoint records, but that meant that a changed value was not reflected in the standby until the first checkpoint after the change. Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to the sequence it used to follow, before hot standby and subsequent patches changed it to 0x9003.
* Rename standby_keep_segments to wal_keep_segments.Robert Haas2010-04-20
| | | | | | Also, make the name of the GUC and the name of the backing variable match. Alnong the way, clean up a couple of slight typographical errors in the related docs.
* Remove some additional changes in previous commit that belong elsewhere.Simon Riggs2010-04-18
|
* Tune GetSnapshotData() during Hot Standby by avoiding loopSimon Riggs2010-04-18
| | | | | | | through normal backends. Makes code clearer also, since we avoid various Assert()s. Performance of snapshots taken during recovery no longer depends upon number of read-only backends.
* Change the logic to decide when to delete old WAL segments, so that itHeikki Linnakangas2010-04-12
| | | | | | | | | | doesn't take into account how far the WAL senders are. This way a hung WAL sender doesn't prevent old WAL segments from being recycled/removed in the primary, ultimately causing the disk to fill up. Instead add standby_keep_segments setting to control how many old WAL segments are kept in the primary. This also makes it more reliable to use streaming replication without WAL archiving, assuming that you set standby_keep_segments high enough.
* Refer to max_wal_senders in a more consistent fashion.Robert Haas2010-04-01
| | | | | | | The error message now makes explicit reference to the GUC that must be changed to fix the problem, using wording suggested by Tom Lane. Along the way, rename the GUC from MaxWalSenders to max_wal_senders for consistency and grep-ability.
* Adjust comment in .history file to match recovery target specified. CommentSimon Riggs2010-03-19
| | | | | | | | present since 8.0 was never fully meaningful, since two recovery targets cannot be specified. Refactor recovery target type to make this change and associated code easier to understand. No change in function. Bug report arising from internal support question.
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Remove old-style VACUUM FULL (which was known for a little while asTom Lane2010-02-08
| | | | | | | | | | | | | | | | | VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity. Per discussion, the use case for this method of vacuuming is no longer large enough to justify maintaining it; not to mention that we don't wish to invest the work that would be needed to make it play nicely with Hot Standby. Aside from the code directly related to old-style VACUUM FULL, this commit removes support for certain WAL record types that could only be generated within VACUUM FULL, redirect-pointer removal in heap_page_prune, and nontransactional generation of cache invalidation sinval messages (the last being the sticking point for Hot Standby). We still have to retain all code that copes with finding HEAP_MOVED_OFF and HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long as we want to support in-place update from pre-9.0 databases.
* Revoke augmentation of WAL records for btree delete, per discussion.Simon Riggs2010-02-01
|
* Augment WAL records for btree delete with GetOldestXmin() to reduceSimon Riggs2010-01-29
| | | | | | | | false positives during Hot Standby conflict processing. Simple patch to enhance conflict processing, following previous discussions. Controlled by parameter minimize_standby_conflicts = on | off, with default off allows measurement of performance impact to see whether it should be set on all the time.
* Change a few remaining calls of XLogArchivingActive() to useHeikki Linnakangas2010-01-28
| | | | | | | XLogIsNeeded() instead, to determine if an otherwise non-logged operation needs to be logged in WAL for standby servers. Fujii Masao
* Write a WAL record whenever we perform an operation without WAL-loggingHeikki Linnakangas2010-01-20
| | | | | | | | that would've been WAL-logged if archiving was enabled. If we encounter such records in archive recovery anyway, we know that some data is missing from the log. A WARNING is emitted in that case. Original patch by Fujii Masao, with changes by me.
* PGDLLIMPORT-ize the remaining variables needed by walreceiver.Tom Lane2010-01-16
|