| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
the current one. Not doing this would leave the walwriter with a handle to a
deleted file if there was nothing for it to do for a long period of time,
preventing the file from being completely removed.
Reported by Tollef Fog Heen, and thanks to Heikki for some hand-holding with
the patch.
|
|
|
|
|
|
|
| |
because InRedo was cleared before recovery_end_command was executed.
Also, always take ControlFileLock when reading checkpoint location for
%r. That didn't matter before, but in 8.4 bgwriter is active during
recovery and can modify the control file concurrently.
|
|
|
|
|
|
|
|
|
|
|
| |
segment of XLOG_BACKUP_END record even if the the record is placed
at a segment boundary. Furthermore the previous implementation could
return nonexistent segment file name when the boundary is in segments
that has "FE" suffix; We never use segments with "FF" suffix.
Backpatch to 8.0, where hot backup was introduced.
Reported by Fujii Masao.
|
|
|
|
|
|
| |
it in the control file at crash recovery following an archive recovery.
Per Fujii Masao and subsequent discussion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of checkpoint. Although the checkpoint has been written to WAL at that point
already, so that all data is safe, and we'll retry removing the WAL segment at
the next checkpoint, if such a failure persists we won't be able to remove any
other old WAL segments either and will eventually run out of disk space. It's
better to treat the failure as non-fatal, and move on to clean any other WAL
segment and continue with any other end-of-checkpoint cleanup.
We don't normally expect any such failures, but on Windows it can happen with
some anti-virus or backup software that lock files without FILE_SHARE_DELETE
flag.
Also, the loop in pgrename() to retry when the file is locked was broken. If a
file is locked on Windows, you get ERROR_SHARE_VIOLATION, not
ERROR_ACCESS_DENIED, at least on modern versions. Fix that, although I left
the check for ERROR_ACCESS_DENIED in there as well (presumably it was correct
in some environment), and added ERROR_LOCK_VIOLATION to be consistent with
similar checks in pgwin32_open(). Reduce the timeout on the loop from 30s to
10s, on the grounds that since it's been broken, we've effectively had a
timeout of 0s and no-one has complained, so a smaller timeout is actually
closer to the old behavior. A longer timeout would mean that if recycling a
WAL file fails because it's locked for some reason, InstallXLogFileSegment()
will hold ControlFileLock for longer, potentially blocking other backends, so
a long timeout isn't totally harmless.
While we're at it, set errno correctly in pgrename().
Backpatch to 8.2, which is the oldest version supported on Windows. The xlog.c
changes would make sense on other platforms and thus on older versions as
well, but since there's no such locking issues on other platforms, it's not
worth it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
file handle on it, the file goes into "pending deletion" state where it
still shows up in directory listing, but isn't accessible otherwise. That
confuses RemoveOldXLogFiles(), making it think that the file hasn't been
archived yet, while it actually was, and it was deleted along with the .done
file.
Fix that by renaming the file with ".deleted" extension before deleting it.
Also check the return value of rename() and unlink(), so that if the removal
fails for any reason (e.g another process is holding the file locked), we
don't delete the .done file until the WAL file is really gone.
Backpatch to 8.2, which is the oldest version supported on Windows.
|
|
|
|
|
|
|
|
|
|
| |
was incorrectly initialized with timeline ID 0. That rendered the WAL page
unrecoverable, making a subsequent archive recovery stop at that point.
ThisTimeLineID needs to be initialized before calling AdvanceXLInsertBuffer().
This fixes bug #5011 reported by James Bardin. Backpatch to 8.4, as the bug
was introduced by the changes to use of bgwriter for writing the
end-of-archive-recovery checkpoint. Patch by Tom Lane.
|
|
|
|
| |
Per comment from Simon.
|
|
|
|
|
| |
appears to explain the recent reports of "PANIC: cannot make new WAL entries
during recovery".
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
archive recovery. Invent a separate state variable and inquiry function
for XLogInsertAllowed() to clarify some tests and make the management of
writing the end-of-recovery checkpoint less klugy. Fix several places
that were incorrectly testing InRecovery when they should be looking at
RecoveryInProgress or XLogInsertAllowed (because they will now be executed
in the bgwriter not startup process). Clarify handling of bad LSNs passed
to XLogFlush during recovery. Use a spinlock for setting/testing
SharedRecoveryInProgress. Improve quite a lot of comments.
Heikki and Tom
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
during it:
When bgwriter is active, the startup process can't perform mdsync() correctly
because it won't see the fsync requests accumulated in bgwriter's private
pendingOpsTable. Therefore make bgwriter responsible for the end-of-recovery
checkpoint as well, when it's active.
When bgwriter is active (= archive recovery), the startup process must not
accumulate fsync requests to its own pendingOpsTable, since bgwriter won't
see them there when it performs restartpoints. Make startup process drop its
pendingOpsTable when bgwriter is launched to avoid that.
Update minimum recovery point one last time when leaving archive recovery.
It won't be updated by the end-of-recovery checkpoint because XLogFlush()
sees us as out of recovery already.
This fixes bug #4879 reported by Fujii Masao.
|
|
|
|
| |
provided by Andrew.
|
|
|
|
|
|
|
|
| |
symbolic links with the -l option, and as Fujii Masao pointed out we ended up
overwriting files in the archive directory before this patch. Patch by
Aidan Van Dyk, Fujii Masao and me.
Backpatch to 8.3, where pg_standby was introduced.
|
|
|
|
|
|
| |
all transactions are archived.
Original patch by Guillaume Smet.
|
|
|
|
|
|
|
| |
copies?) to ensure they really don't run proc_exit/shmem_exit callbacks,
as was intended. I broke this behavior recently by installing atexit
callbacks without thinking about the one case where we truly don't want
to run those callback functions. Noted in an example from Dave Page.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is run at the end of archive recovery, providing a chance to do external
cleanup. Modify pg_standby so that it no longer removes the trigger file,
that is to be done using the recovery_end_command now.
Provide a "smart" failover mode in pg_standby, where we don't fail over
immediately, but only after recovering all unapplied WAL from the archive.
That gives you zero data loss assuming all WAL was archived before
failover, which is what most users of pg_standby actually want.
recovery_end_command by Simon Riggs, pg_standby changes by Fujii Masao and
myself.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
you can end up with an unrecoverable backup if you start a new base backup
right after finishing archive recovery. In that scenario, the redo pointer of
the checkpoint that pg_start_backup() writes points to the XLOG segment where
the timeline-changing end-of-archive-recovery checkpoint is. The beginning
of that segment contains pages with the old timeline ID, and we don't accept
that in recovery unless we find a history file covering the old timeline ID.
If you omit pg_xlog from the base backup and clear the archive directory
before starting the backup, there will be no such history file available.
The bug is present in all versions since PITR was introduced in 8.0, but I'm
back-patching only back to 8.2. Earlier versions didn't have XLOG switch
records, making this fix unfeasible. Given the lack of reports until now,
it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1.
Per report and suggestion by Mikael Krantz
|
|
|
|
|
|
|
| |
ready for archival. It was marked at the next checkpoint anyway, but
waiting for the next checkpoint is an unnecessary delay.
Fujii Masao
|
|
|
|
|
|
| |
the checkpoint in immediate or lazy mode. This is to address complaints
that pg_start_backup() takes a long time even when there's no need to minimize
its I/O consumption.
|
|
|
|
|
|
|
| |
some bufmgr probes, take out redundant and memory-leak-inducing path arguments
to smgr__md__read__done and smgr__md__write__done, fix bogus attempt to
recalculate space used in sort__done, clean up formatting in places where
I'm not sure pgindent will do a nice job by itself.
|
|
|
|
| |
Fujii Masao
|
|
|
|
|
| |
of recovery by exiting with exit code 0, like in previous releases. Per
Tom's suggestion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
|
| |
|
|
|
|
|
| |
RestoreBkpBlocks. Went missing in my recent refactoring patch, as pointed
out by Simon's hot standby patch.
|
|
|
|
|
|
|
|
|
| |
be used instead of the normal exclusive lock, and make WAL redo functions
responsible for calling RestoreBkpBlocks(). They know better what kind of a
lock they need.
At the moment, this just moves things around with no functional change, but
makes the hot standby patch that's under review cleaner.
|
|
|
|
|
|
|
| |
we can get some buildfarm feedback about whether that function is still
problematic. (Note that the planned async-preread patch will not really
prove anything one way or the other in buildfarm testing, since it will
be inactive with default GUC settings.)
|
| |
|
|
|
|
|
|
| |
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY
Robert Lor
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
includes a few new ones.
- Fixed compilation errors on OS X for probes that use typedefs
- Fixed a number of probes to pass ForkNumber per the relation forks
patch
- The new probes are those that were taken out from the previous
submitted patch and required simple fixes. Will submit the other probes
that may require more discussion in a separate patch.
Robert Lor
|
|
|
|
|
|
| |
wait for the previous instead of the new file to be archived.
Based on patch by Simon Riggs.
|
|
|
|
|
|
|
| |
If the latter doesn't exist, automatically recreate it. (We don't do
this for pg_xlog, though, per discussion.)
Jonah Harris
|
|
|
|
|
|
|
|
|
|
|
|
| |
functions into one ReadBufferExtended function, that takes the strategy
and mode as argument. There's three modes, RBM_NORMAL which is the default
used by plain ReadBuffer(), RBM_ZERO, which replaces ZeroOrReadBuffer, and
a new mode RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages
without throwing an error. The FSM needs the new mode to recover from
corrupt pages, which could happend if we crash after extending an FSM file,
and the new page is "torn".
Add fork number to some error messages in bufmgr.c, that still lacked it.
|
|
|
|
| |
Per gripe from Kevin Grittner. Backpatch to 8.3, where the bug was introduced.
|
|
|
|
|
|
|
|
| |
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.
This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
|
|
|
|
|
|
|
|
|
|
| |
for pg_stop_backup. First, it is possible that the history file name is not
alphabetically later than the last WAL file name, so we should explicitly
check that both have been archived. Second, the previous coding would wait
forever if a checkpoint had managed to remove the WAL file before we look for
it.
Simon Riggs, plus some code cleanup by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of multiple forks, and each fork can be created and grown separately.
The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.
This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
|
|
|
|
|
|
|
|
|
|
| |
SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that
makes the code clearer, and avoid casting between Page and PageHeader where
possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas.
I did not apply the parts of the proposed patch that would have resulted in
slightly changing the on-disk format of hash indexes; it seems to me that's
not a win as long as there's any chance of having in-place upgrade for 8.4.
|
|
|
|
| |
values as postgresql.conf.
|
|
|
|
|
|
|
|
|
|
| |
forks. XLogOpenRelation() and the associated light-weight relation cache in
xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument,
instead of Relation.
For functions that still need a Relation struct during WAL replay, there's a
new function called CreateFakeRelcacheEntry() that returns a fake entry like
XLogOpenRelation() used to.
|
|
|
|
|
|
|
|
|
|
| |
more logical that way, and also it reduces the amount of unnecessary includes
in bufpage.h, which is widely used.
Zdenek Kotala.
My previous patch to bufpage.h should also have credited him as author, but I
forgot (sorry about that).
|
| |
|
|
|
|
|
|
|
|
| |
Formerly, the default value of wal_sync_method was determined inside xlog.c,
but now it is determined inside guc.c. guc.c was reading xlogdefs.h
without having read <fcntl.h>, leading to wrong determination of
DEFAULT_SYNC_METHOD. Obviously xlogdefs.h needs to include <fcntl.h>
for itself to ensure stable results.
|
| |
|
|
|
|
|
|
|
|
|
| |
modes, replacing it with a call to a function that derives it from the
sync_method variable, now that it has distinct values for these two cases.
This means that assign_xlog_sync_method() no longer changes any settings,
thus fixing the bug introduced in the change to use a guc enum for
wal_sync_method.
|
|
|
|
|
|
|
| |
crashes on certain platforms. In particular, the MSVC runtime is known
to do this.
Fixes bug #4162, reported and diagnosed by Javier Pimas
|
|
|
|
| |
O_DSYNC (specifically this broke all the Windows buildfarm members)
|
|
|
|
|
| |
Remove #include bufmgr.h from (most?) source files which already include
bufpage.h.
|
|
|
|
| |
not which one we had before (that worked, and thus is completley irrelevant)
|