aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/transam/xlog.c
Commit message (Collapse)AuthorAge
* Make the walwriter close it's handle to an old xlog segment if it's no longerMagnus Hagander2010-06-09
| | | | | | | | | the current one. Not doing this would leave the walwriter with a handle to a deleted file if there was nothing for it to do for a long period of time, preventing the file from being completely removed. Reported by Tollef Fog Heen, and thanks to Heikki for some hand-holding with the patch.
* Fix bug in %r handling in recovery_end_command, it always came out as 0Heikki Linnakangas2010-03-18
| | | | | | | because InRedo was cleared before recovery_end_command was executed. Also, always take ControlFileLock when reading checkpoint location for %r. That didn't matter before, but in 8.4 bgwriter is active during recovery and can modify the control file concurrently.
* Fix STOP WAL LOCATION in backup history files no to return the nextItagaki Takahiro2010-02-19
| | | | | | | | | | | segment of XLOG_BACKUP_END record even if the the record is placed at a segment boundary. Furthermore the previous implementation could return nonexistent segment file name when the boundary is in segments that has "FE" suffix; We never use segments with "FF" suffix. Backpatch to 8.0, where hot backup was introduced. Reported by Fujii Masao.
* Reset minRecoveryPoint at checkpoints, so that we don't uselessly updateHeikki Linnakangas2009-12-30
| | | | | | it in the control file at crash recovery following an archive recovery. Per Fujii Masao and subsequent discussion.
* Don't error out if recycling or removing an old WAL segment fails at the endHeikki Linnakangas2009-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of checkpoint. Although the checkpoint has been written to WAL at that point already, so that all data is safe, and we'll retry removing the WAL segment at the next checkpoint, if such a failure persists we won't be able to remove any other old WAL segments either and will eventually run out of disk space. It's better to treat the failure as non-fatal, and move on to clean any other WAL segment and continue with any other end-of-checkpoint cleanup. We don't normally expect any such failures, but on Windows it can happen with some anti-virus or backup software that lock files without FILE_SHARE_DELETE flag. Also, the loop in pgrename() to retry when the file is locked was broken. If a file is locked on Windows, you get ERROR_SHARE_VIOLATION, not ERROR_ACCESS_DENIED, at least on modern versions. Fix that, although I left the check for ERROR_ACCESS_DENIED in there as well (presumably it was correct in some environment), and added ERROR_LOCK_VIOLATION to be consistent with similar checks in pgwin32_open(). Reduce the timeout on the loop from 30s to 10s, on the grounds that since it's been broken, we've effectively had a timeout of 0s and no-one has complained, so a smaller timeout is actually closer to the old behavior. A longer timeout would mean that if recycling a WAL file fails because it's locked for some reason, InstallXLogFileSegment() will hold ControlFileLock for longer, potentially blocking other backends, so a long timeout isn't totally harmless. While we're at it, set errno correctly in pgrename(). Backpatch to 8.2, which is the oldest version supported on Windows. The xlog.c changes would make sense on other platforms and thus on older versions as well, but since there's no such locking issues on other platforms, it's not worth it.
* On Windows, when a file is deleted and another process still has an openHeikki Linnakangas2009-09-10
| | | | | | | | | | | | | | | file handle on it, the file goes into "pending deletion" state where it still shows up in directory listing, but isn't accessible otherwise. That confuses RemoveOldXLogFiles(), making it think that the file hasn't been archived yet, while it actually was, and it was deleted along with the .done file. Fix that by renaming the file with ".deleted" extension before deleting it. Also check the return value of rename() and unlink(), so that if the removal fails for any reason (e.g another process is holding the file locked), we don't delete the .done file until the WAL file is really gone. Backpatch to 8.2, which is the oldest version supported on Windows.
* In the checkpoint written at the end of archive recovery, the WAL page headerHeikki Linnakangas2009-08-27
| | | | | | | | | | was incorrectly initialized with timeline ID 0. That rendered the WAL page unrecoverable, making a subsequent archive recovery stop at that point. ThisTimeLineID needs to be initialized before calling AdvanceXLInsertBuffer(). This fixes bug #5011 reported by James Bardin. Backpatch to 8.4, as the bug was introduced by the changes to use of bgwriter for writing the end-of-archive-recovery checkpoint. Patch by Tom Lane.
* Document that LocalSetXLogInsertAllowed can be re-executed.Tom Lane2009-08-08
| | | | Per comment from Simon.
* rm_cleanup functions need to be allowed to write WAL entries. This oversightTom Lane2009-08-07
| | | | | appears to explain the recent reports of "PANIC: cannot make new WAL entries during recovery".
* Cleanup and code review for the patch that made bgwriter active duringTom Lane2009-06-26
| | | | | | | | | | | | | archive recovery. Invent a separate state variable and inquiry function for XLogInsertAllowed() to clarify some tests and make the management of writing the end-of-recovery checkpoint less klugy. Fix several places that were incorrectly testing InRecovery when they should be looking at RecoveryInProgress or XLogInsertAllowed (because they will now be executed in the bgwriter not startup process). Clarify handling of bad LSNs passed to XLogFlush during recovery. Use a spinlock for setting/testing SharedRecoveryInProgress. Improve quite a lot of comments. Heikki and Tom
* Fix some serious bugs in archive recovery, now that bgwriter is activeHeikki Linnakangas2009-06-25
| | | | | | | | | | | | | | | | | | | | during it: When bgwriter is active, the startup process can't perform mdsync() correctly because it won't see the fsync requests accumulated in bgwriter's private pendingOpsTable. Therefore make bgwriter responsible for the end-of-recovery checkpoint as well, when it's active. When bgwriter is active (= archive recovery), the startup process must not accumulate fsync requests to its own pendingOpsTable, since bgwriter won't see them there when it performs restartpoints. Make startup process drop its pendingOpsTable when bgwriter is launched to avoid that. Update minimum recovery point one last time when leaving archive recovery. It won't be updated by the end-of-recovery checkpoint because XLogFlush() sees us as out of recovery already. This fixes bug #4879 reported by Fujii Masao.
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Only recycle normal files in pg_xlog as WAL segments. pg_standby createsHeikki Linnakangas2009-06-02
| | | | | | | | symbolic links with the -l option, and as Fujii Masao pointed out we ended up overwriting files in the archive directory before this patch. Patch by Aidan Van Dyk, Fujii Masao and me. Backpatch to 8.3, where pg_standby was introduced.
* When archiving is enabled, rotate the last WAL segment at shutdown so thatHeikki Linnakangas2009-05-28
| | | | | | all transactions are archived. Original patch by Guillaume Smet.
* Fix all the server-side SIGQUIT handlers (grumble ... why so many identicalTom Lane2009-05-15
| | | | | | | copies?) to ensure they really don't run proc_exit/shmem_exit callbacks, as was intended. I broke this behavior recently by installing atexit callbacks without thinking about the one case where we truly don't want to run those callback functions. Noted in an example from Dave Page.
* Improve a couple of comments.Tom Lane2009-05-14
|
* Add recovery_end_command option to recovery.conf. recovery_end_commandHeikki Linnakangas2009-05-14
| | | | | | | | | | | | | | is run at the end of archive recovery, providing a chance to do external cleanup. Modify pg_standby so that it no longer removes the trigger file, that is to be done using the recovery_end_command now. Provide a "smart" failover mode in pg_standby, where we don't fail over immediately, but only after recovering all unapplied WAL from the archive. That gives you zero data loss assuming all WAL was archived before failover, which is what most users of pg_standby actually want. recovery_end_command by Simon Riggs, pg_standby changes by Fujii Masao and myself.
* Request XLOG switch before writing checkpoint in pg_start_backup(). OtherwiseHeikki Linnakangas2009-05-07
| | | | | | | | | | | | | | | | | | you can end up with an unrecoverable backup if you start a new base backup right after finishing archive recovery. In that scenario, the redo pointer of the checkpoint that pg_start_backup() writes points to the XLOG segment where the timeline-changing end-of-archive-recovery checkpoint is. The beginning of that segment contains pages with the old timeline ID, and we don't accept that in recovery unless we find a history file covering the old timeline ID. If you omit pg_xlog from the base backup and clear the archive directory before starting the backup, there will be no such history file available. The bug is present in all versions since PITR was introduced in 8.0, but I'm back-patching only back to 8.2. Earlier versions didn't have XLOG switch records, making this fix unfeasible. Given the lack of reports until now, it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1. Per report and suggestion by Mikael Krantz
* After archive recovery, mark the last WAL segment from the parent timelineHeikki Linnakangas2009-04-22
| | | | | | | ready for archival. It was marked at the next checkpoint anyway, but waiting for the next checkpoint is an unnecessary delay. Fujii Masao
* Add an optional parameter to pg_start_backup() that specifies whether to doTom Lane2009-04-07
| | | | | | the checkpoint in immediate or lazy mode. This is to address complaints that pg_start_backup() takes a long time even when there's no need to minimize its I/O consumption.
* Code review for dtrace probes added (so far) to 8.4. Adjust placement ofTom Lane2009-03-11
| | | | | | | some bufmgr probes, take out redundant and memory-leak-inducing path arguments to smgr__md__read__done and smgr__md__write__done, fix bogus attempt to recalculate space used in sort__done, clean up formatting in places where I'm not sure pgindent will do a nice job by itself.
* Reload config file in startup process on SIGHUP.Heikki Linnakangas2009-03-04
| | | | Fujii Masao
* Change the signaling of end-of-recovery. Startup process now indicates endHeikki Linnakangas2009-02-23
| | | | | of recovery by exiting with exit code 0, like in previous releases. Per Tom's suggestion.
* Start background writer during archive recovery. Background writer now performsHeikki Linnakangas2009-02-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | its usual buffer cleaning duties during archive recovery, and it's responsible for performing restartpoints. This requires some changes in postmaster. When the startup process has done all the initialization and is ready to start WAL redo, it signals the postmaster to launch the background writer. The postmaster is signaled again when the point in recovery is reached where we know that the database is in consistent state. Postmaster isn't interested in that at the moment, but that's the point where we could let other backends in to perform read-only queries. The postmaster is signaled third time when the recovery has ended, so that postmaster knows that it's safe to start accepting connections. The startup process now traps SIGTERM, and performs a "clean" shutdown. If you do a fast shutdown during recovery, a shutdown restartpoint is performed, like a shutdown checkpoint, and postmaster kills the processes cleanly. You still have to continue the recovery at next startup, though. Currently, the background writer is only launched during archive recovery. We could launch it during crash recovery as well, but it seems better to keep that codepath as simple as possible, for the sake of robustness. And it couldn't do any restartpoints during crash recovery anyway, so it wouldn't be that useful. log_restartpoints is gone. Use log_checkpoints instead. This is yet to be documented. This whole operation is a pre-requisite for Hot Standby, but has some value of its own whether the hot standby patch makes 8.4 or not. Simon Riggs, with lots of modifications by me.
* Fix obsolete comment. Zdenek KotalaHeikki Linnakangas2009-02-07
|
* Put back fast-path for the case that there's no backup blocks inHeikki Linnakangas2009-01-23
| | | | | RestoreBkpBlocks. Went missing in my recent refactoring patch, as pointed out by Simon's hot standby patch.
* Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock shouldHeikki Linnakangas2009-01-20
| | | | | | | | | be used instead of the normal exclusive lock, and make WAL redo functions responsible for calling RestoreBkpBlocks(). They know better what kind of a lock they need. At the moment, this just moves things around with no functional change, but makes the hot standby patch that's under review cleaner.
* Re-enable the old code in xlog.c that tried to use posix_fadvise(), so thatTom Lane2009-01-11
| | | | | | | we can get some buildfarm feedback about whether that function is still problematic. (Note that the planned async-preread patch will not really prove anything one way or the other in buildfarm testing, since it will be inactive with default GUC settings.)
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Change the name of dtrace wal tracepoints:Bruce Momjian2008-12-24
| | | | | | TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY Robert Lor
* The attached patch contains a couple of fixes in the existing probes andBruce Momjian2008-12-17
| | | | | | | | | | | | | includes a few new ones. - Fixed compilation errors on OS X for probes that use typedefs - Fixed a number of probes to pass ForkNumber per the relation forks patch - The new probes are those that were taken out from the previous submitted patch and required simple fixes. Will submit the other probes that may require more discussion in a separate patch. Robert Lor
* If pg_stop_backup() is called just after switching to a new xlog file,Heikki Linnakangas2008-12-03
| | | | | | wait for the previous instead of the new file to be archived. Based on patch by Simon Riggs.
* Add a startup check that pg_xlog and pg_xlog/archive_status exist.Tom Lane2008-11-09
| | | | | | | If the latter doesn't exist, automatically recreate it. (We don't do this for pg_xlog, though, per discussion.) Jonah Harris
* Unite ReadBufferWithFork, ReadBufferWithStrategy, and ZeroOrReadBufferHeikki Linnakangas2008-10-31
| | | | | | | | | | | | functions into one ReadBufferExtended function, that takes the strategy and mode as argument. There's three modes, RBM_NORMAL which is the default used by plain ReadBuffer(), RBM_ZERO, which replaces ZeroOrReadBuffer, and a new mode RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages without throwing an error. The FSM needs the new mode to recover from corrupt pages, which could happend if we crash after extending an FSM file, and the new page is "torn". Add fork number to some error messages in bufmgr.c, that still lacked it.
* Fix recoveryLastXTime logic so that it actually does what one would expect.Tom Lane2008-10-30
| | | | Per gripe from Kevin Grittner. Backpatch to 8.3, where the bug was introduced.
* Make LC_COLLATE and LC_CTYPE database-level settings. Collation andHeikki Linnakangas2008-09-23
| | | | | | | | ctype are now more like encoding, stored in new datcollate and datctype columns in pg_database. This is a stripped-down version of Radek Strnad's patch, with further changes by me.
* Fix a couple of problems pointed out by Fujii Masao in the 2008-Apr-05 patchTom Lane2008-09-08
| | | | | | | | | | for pg_stop_backup. First, it is possible that the history file name is not alphabetically later than the last WAL file name, so we should explicitly check that both have been archived. Second, the previous coding would wait forever if a checkpoint had managed to remove the WAL file before we look for it. Simon Riggs, plus some code cleanup by me.
* Introduce the concept of relation forks. An smgr relation can now consistHeikki Linnakangas2008-08-11
| | | | | | | | | | | | | | | | of multiple forks, and each fork can be created and grown separately. The bulk of this patch is about changing the smgr API to include an extra ForkNumber argument in every smgr function. Also, smgrscheduleunlink and smgrdounlink no longer implicitly call smgrclose, because other forks might still exist after unlinking one. The callers of those functions have been modified to call smgrclose instead. This patch in itself doesn't have any user-visible effect, but provides the infrastructure needed for upcoming patches. The additional forks envisioned are a rewritten FSM implementation that doesn't rely on a fixed-size shared memory block, and a visibility map to allow skipping portions of a table in VACUUM that have no dead tuples.
* Clean up the use of some page-header-access macros: principally, useTom Lane2008-07-13
| | | | | | | | | | SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that makes the code clearer, and avoid casting between Page and PageHeader where possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas. I did not apply the parts of the proposed patch that would have resulted in slightly changing the on-disk format of hash indexes; it seems to me that's not a win as long as there's any chance of having in-place upgrade for 8.4.
* Fix recovery.conf boolean variables to take the same range of stringBruce Momjian2008-06-30
| | | | values as postgresql.conf.
* Refactor XLogOpenRelation() and XLogReadBuffer() in preparation for relationHeikki Linnakangas2008-06-12
| | | | | | | | | | forks. XLogOpenRelation() and the associated light-weight relation cache in xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument, instead of Relation. For functions that still need a Relation struct during WAL replay, there's a new function called CreateFakeRelcacheEntry() that returns a fake entry like XLogOpenRelation() used to.
* Move BufferGetPageSize and BufferGetPage from bufpage.h to bufmgr.h. It isAlvaro Herrera2008-06-08
| | | | | | | | | | more logical that way, and also it reduces the amount of unnecessary includes in bufpage.h, which is widely used. Zdenek Kotala. My previous patch to bufpage.h should also have credited him as author, but I forgot (sorry about that).
* Set hidden field for guc enum missed in previous commit.Magnus Hagander2008-05-28
|
* Fix a subtle bug exposed by recent wal_sync_method rearrangements.Tom Lane2008-05-17
| | | | | | | | Formerly, the default value of wal_sync_method was determined inside xlog.c, but now it is determined inside guc.c. guc.c was reading xlogdefs.h without having read <fcntl.h>, leading to wrong determination of DEFAULT_SYNC_METHOD. Obviously xlogdefs.h needs to include <fcntl.h> for itself to ensure stable results.
* Reduce unnecessary PANIC to ERROR, improve a couple of comments.Tom Lane2008-05-16
|
* Remove the special variable for open_sync_bit used in O_SYNC and O_DSYNCMagnus Hagander2008-05-14
| | | | | | | | | modes, replacing it with a call to a function that derives it from the sync_method variable, now that it has distinct values for these two cases. This means that assign_xlog_sync_method() no longer changes any settings, thus fixing the bug introduced in the change to use a guc enum for wal_sync_method.
* Don't try to close negative file descriptors, since this can causeMagnus Hagander2008-05-13
| | | | | | | crashes on certain platforms. In particular, the MSVC runtime is known to do this. Fixes bug #4162, reported and diagnosed by Javier Pimas
* Fix breakage by the wal_sync_method patch in installations that useMagnus Hagander2008-05-12
| | | | O_DSYNC (specifically this broke all the Windows buildfarm members)
* Put back bufmgr.h in bufpage.h -- it is needed by some macros.Alvaro Herrera2008-05-12
| | | | | Remove #include bufmgr.h from (most?) source files which already include bufpage.h.
* Report which WAL sync method we are trying to change *to* when it fails,Magnus Hagander2008-05-12
| | | | not which one we had before (that worked, and thus is completley irrelevant)