aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
...
* Avoid pin scan for replay of XLOG_BTREE_VACUUMSimon Riggs2016-01-09
| | | | | | | | | | | | | | | | | | | | Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed. This commit skips the “pin scan” that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here. The current commit still performs the pin scan for toast indexes, though this can also be avoided if we recheck scans on toast indexes. Later patch will address this. No tests included. Manual tests using an additional patch to view WAL records and their timing have shown the change in WAL records and their handling has successfully reduced replication delay.
* Fix overly-strict assertions in spgtextproc.c.Tom Lane2016-01-02
| | | | | | | | | | | | | | | | | spg_text_inner_consistent is capable of reconstructing an empty string to pass down to the next index level; this happens if we have an empty string coming in, no prefix, and a dummy node label. (In practice, what is needed to trigger that is insertion of a whole bunch of empty-string values.) Then, we will arrive at the next level with in->level == 0 and a non-NULL (but zero length) in->reconstructedValue, which is valid but the Assert tests weren't expecting it. Per report from Andreas Seltenreich. This has no impact in non-Assert builds, so should not be a problem in production, but back-patch to all affected branches anyway. In passing, remove a couple of useless variable initializations and shorten the code by not duplicating DatumGetPointer() calls.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Fix comments about WAL rule "write xlog before data" versus pg_multixact.Noah Misch2016-01-01
| | | | | | | | | | | Recovery does not achieve its goal of zeroing all pg_multixact entries whose accompanying WAL records never reached disk. Remove that claim and justify its expendability. Detail the need for TrimMultiXact(), which has little in common with the TrimCLOG() rationale. Merge two tightly-related comments. Stop presenting pg_multixact as specific to heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update(). Noticed while investigating a report from Andres Freund.
* Rename (new|old)estCommitTs to (new|old)estCommitTsXidJoe Conway2015-12-28
| | | | | | | | | | | | | The variables newestCommitTs and oldestCommitTs sound as if they are timestamps, but in fact they are the transaction Ids that correspond to the newest and oldest timestamps rather than the actual timestamps. Rename these variables to reflect that they are actually xids: to wit newestCommitTsXid and oldestCommitTsXid respectively. Also modify related code in a similar fashion, particularly the user facing output emitted by pg_controldata and pg_resetxlog. Complaint and patch by me, review by Tom Lane and Alvaro Herrera. Backpatch to 9.5 where these variables were first introduced.
* Fix brin_summarize_new_values() to check index type and ownership.Tom Lane2015-12-26
| | | | | | | | | | brin_summarize_new_values() did not check that the passed OID was for an index at all, much less that it was a BRIN index, and would fail in obscure ways if it wasn't (possibly damaging data first?). It also lacked any permissions test; by analogy to VACUUM, we should only allow the table's owner to summarize. Noted by Jeff Janes, fix by Michael Paquier and me
* Fix typo in comment.Robert Haas2015-12-18
| | | | Amit Langote
* Provide a way to predefine LWLock tranche IDs.Robert Haas2015-12-15
| | | | | | | | | | | It's a bit cumbersome to use LWLockNewTrancheId(), because the returned value needs to be shared between backends so that each backend can call LWLockRegisterTranche() with the correct ID. So, for built-in tranches, use a hard-coded value instead. This is motivated by an upcoming patch adding further built-in tranches. Andres Freund and Robert Haas
* Fix bug in SetOffsetVacuumLimit() triggered by find_multixact_start() failure.Andres Freund2015-12-14
| | | | | | | | | | | | | | | | Previously, if find_multixact_start() failed, SetOffsetVacuumLimit() would install 0 into MultiXactState->offsetStopLimit if it previously succeeded. Luckily, there are no known cases where find_multixact_start() will return an error in 9.5 and above. But if it were to happen, for example due to filesystem permission issues, it'd be somewhat bad: GetNewMultiXactId() could continue allocating mxids even if close to a wraparound, or it could erroneously stop allocating mxids, even if no wraparound is looming. The wrong value would be corrected the next time SetOffsetVacuumLimit() is called, or by a restart. Reported-By: Noah Misch, although this is not his preferred fix Discussion: 20151210140450.GA22278@alap3.anarazel.de Backpatch: 9.5, where the bug was introduced as part of 4f627f
* Fix commit timestamp initializationAlvaro Herrera2015-12-11
| | | | | | | | | | | | | | | | | | | | | | This module needs explicit initialization in order to replay WAL records in recovery, but we had broken this recently following changes to make other (stranger) scenarios work correctly. To fix, rework the initialization sequence so that it always takes place before WAL replay commences for both master and standby. I could have gone for a more localized fix that just added a "startup" call for the master server, but it seemed better to restructure the existing callers as well so that the whole thing made more sense. As a drawback, there is more control logic in xlog.c now than previously, but doing otherwise meant passing down the ControlFile flag, which seemed uglier as a whole. This also meant adding a check to not re-execute ActivateCommitTs if it had already been called. Reported by Fujii Masao. Backpatch to 9.5.
* Improve some messagesPeter Eisentraut2015-12-10
|
* Fix bug leading to restoring unlogged relations from empty files.Andres Freund2015-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the end of crash recovery, unlogged relations are reset to the empty state, using their init fork as the template. The init fork is copied to the main fork without going through shared buffers. Unfortunately WAL replay so far has not necessarily flushed writes from shared buffers to disk at that point. In normal crash recovery, and before the introduction of 'fast promotions' in fd4ced523 / 9.3, the END_OF_RECOVERY checkpoint flushes the buffers out in time. But with fast promotions that's not the case anymore. To fix, force WAL writes targeting the init fork to be flushed immediately (using the new FlushOneBuffer() function). In 9.5+ that flush can centrally be triggered from the code dealing with restoring full page writes (XLogReadBufferForRedoExtended), in earlier releases that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay function. Backpatch to 9.1, even if this currently is only known to trigger in 9.3+. Flushing earlier is more robust, and it is advantageous to keep the branches similar. Typical symptoms of this bug are errors like 'ERROR: index "..." contains unexpected zero page at block 0' shortly after promoting a node. Reported-By: Thom Brown Author: Andres Freund and Michael Paquier Discussion: 20150326175024.GJ451@alap3.anarazel.de Backpatch: 9.1-
* Further tweak commit_timestamp behaviorAlvaro Herrera2015-12-03
| | | | | | | | | | | | | | | | | | | | | | As pointed out by Fujii Masao, we weren't quite there on a standby behaving sanely: first because we were failing to acquire the correct state in the case where no XLOG_PARAMETER_CHANGE message was sent (because a checkpoint had already happened after the setting was changed in the master, and then the standby was restarted); and second because promoting the standby with the feature enabled failed to activate it if the master had the feature disabled. This patch fixes both those misbehaviors hopefully without re-introducing any old problems. Also change the hint emitted in a standby together with the error message about the feature being disabled, to make it point out that the place to chance the setting is the master. Otherwise, if the setting is already enabled in the standby, it is very confusing to have it say that the setting must be enabled ... Authors: Álvaro Herrera, Petr Jelínek. Backpatch to 9.5.
* Fix typo in comment.Robert Haas2015-11-19
| | | | Amit Langote
* Remove function names from some elog() calls in heapam.c.Andres Freund2015-11-19
| | | | | | | | | | At least one of the names was, due to a function renaming late in the development of ON CONFLICT, wrong. Since including function names in error messages is against the message style guide anyway, remove them from the messages. Discussion: CAM3SWZT8paz=usgMVHm0XOETkQvzjRtAUthATnmaHQQY0obnGw@mail.gmail.com Backpatch: 9.5, where ON CONFLICT was introduced
* Message improvementsPeter Eisentraut2015-11-16
|
* Move each SLRU's lwlocks to a separate tranche.Robert Haas2015-11-12
| | | | | | | | | | This makes it significantly easier to identify these lwlocks in LWLOCK_STATS or Trace_lwlocks output. It's also arguably better from a modularity standpoint, since lwlock.c no longer needs to know anything about the LWLock needs of the higher-level SLRU facility. Ildus Kurbangaliev, reviewd by Álvaro Herrera and by me.
* Pass extra data to bgworkers, and use this to fix parallel contexts.Robert Haas2015-11-05
| | | | | | | | | | | | | | | | | | | | | | | | Up until now, the total amount of data that could be passed to a background worker at startup was one datum, which can be a small as 4 bytes on some systems. That's enough to pass a dsm_handle or an array index, but not much else. Add a bgw_extra flag to the BackgroundWorker struct, allowing up to 128 bytes to be passed to a new worker on any platform. Use this to fix a problem I recently discovered with the parallel context machinery added in 9.5: the master assigns each worker an array index, and each worker subsequently assigns itself an array index, and there's nothing to guarantee that the two sets of indexes match, leading to chaos. Normally, I would not back-patch the change to add bgw_extra, since it is basically a feature addition. However, since 9.5 is still in beta and there seems to be no other sensible way to repair the broken parallel context machinery, back-patch to 9.5. Existing background worker code can ignore the bgw_extra field without a problem, but might need to be recompiled since the structure size has changed. Report and patch by me. Review by Amit Kapila.
* Fix serialization anomalies due to race conditions on INSERT.Kevin Grittner2015-10-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On insert the CheckForSerializableConflictIn() test was performed before the page(s) which were going to be modified had been locked (with an exclusive buffer content lock). If another process acquired a relation SIReadLock on the heap and scanned to a page on which an insert was going to occur before the page was so locked, a rw-conflict would be missed, which could allow a serialization anomaly to be missed. The window between the check and the page lock was small, so the bug was generally not noticed unless there was high concurrency with multiple processes inserting into the same table. This was reported by Peter Bailis as bug #11732, by Sean Chittenden as bug #13667, and by others. The race condition was eliminated in heap_insert() by moving the check down below the acquisition of the buffer lock, which had been the very next statement. Because of the loop locking and unlocking multiple buffers in heap_multi_insert() a check was added after all inserts were completed. The check before the start of the inserts was left because it might avoid a large amount of work to detect a serialization anomaly before performing the all of the inserts and the related WAL logging. While investigating this bug, other SSI bugs which were even harder to hit in practice were noticed and fixed, an unnecessary check (covered by another check, so redundant) was removed from heap_update(), and comments were improved. Back-patch to all supported branches. Kevin Grittner and Thomas Munro
* Update parallel executor support to reuse the same DSM.Robert Haas2015-10-30
| | | | | | | | | | | | | | | | Commit b0b0d84b3d663a148022e900ebfc164284a95f55 purported to make it possible to relaunch workers using the same parallel context, but it had an unpleasant race condition: we might reinitialize after the workers have sent their last control message but before they have dettached the DSM, leaving to crashes. Repair by introducing a new ParallelContext operation, ReinitializeParallelDSM. Adjust execParallel.c to use this new support, so that we can rescan a Gather node by relaunching workers but without needing to recreate the DSM. Amit Kapila, with some adjustments by me. Extracted from latest parallel sequential scan patch.
* Message style improvementsPeter Eisentraut2015-10-28
| | | | | Message style, plurals, quoting, spelling, consistency with similar messages
* Fix BRIN free space computationsAlvaro Herrera2015-10-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | A bug in the original free space computation made it possible to return a page which wasn't actually able to fit the item. Since the insertion code isn't prepared to deal with PageAddItem failing, a PANIC resulted ("failed to add BRIN tuple [to new page]"). Add a macro to encapsulate the correct computation, and use it in brin_getinsertbuffer's callers before calling that routine, to raise an early error. I became aware of the possiblity of a problem in this area while working on ccc4c074994d734. There's no archived discussion about it, but it's easy to reproduce a problem in the unpatched code with something like CREATE TABLE t (a text); CREATE INDEX ti ON t USING brin (a) WITH (pages_per_range=1); for length in `seq 8000 8196` do psql -f - <<EOF TRUNCATE TABLE t; INSERT INTO t VALUES ('z'), (repeat('a', $length)); EOF done Backpatch to 9.5, where BRIN was introduced.
* Cleanup commit timestamp module activaction, againAlvaro Herrera2015-10-27
| | | | | | | | | | | | | Further tweak commit_ts.c so that on a standby the state is completely consistent with what that in the master, rather than behaving differently in the cases that the settings differ. Now in standby and master the module should always be active or inactive in lockstep. Author: Petr Jelínek, with some further tweaks by Álvaro Herrera. Backpatch to 9.5, where commit timestamps were introduced. Discussion: http://www.postgresql.org/message-id/5622BF9D.2010409@2ndquadrant.com
* Fix typos in comments.Robert Haas2015-10-22
| | | | CharSyam
* Add a C API for parallel heap scans.Robert Haas2015-10-16
| | | | | | | | | | | | | | | | Using this API, one backend can set up a ParallelHeapScanDesc to which multiple backends can then attach. Each tuple in the relation will be returned to exactly one of the scanning backends. Only forward scans are supported, and rescans must be carefully coordinated. This is not exposed to the planner or executor yet. The original version of this code was written by me. Amit Kapila reviewed it, tested it, and improved it, including adding support for synchronized scans, per review comments from Jeff Davis. Extensive testing of this and related patches was performed by Haribabu Kommi. Final cleanup of this patch by me.
* Allow a parallel context to relaunch workers.Robert Haas2015-10-16
| | | | | | | | | This may allow some callers to avoid the overhead involved in tearing down a parallel context and then setting up a new one, which means releasing the DSM and then allocating and populating a new one. I suspect we'll want to revise the Gather node to make use of this new capability, but even if not it may be useful elsewhere and requires very little additional code.
* Prohibit parallel query when the isolation level is serializable.Robert Haas2015-10-16
| | | | | | | | | | | | In order for this to be safe, the code which hands true serializability will need to taught that the SIRead locks taken by a parallel worker pertain to the same transaction as those taken by the parallel leader. Some further changes may be needed as well. Until the necessary adaptations are made, don't generate parallel plans in serializable mode, and if a previously-generated parallel plan is used after serializable mode has been activated, run it serially. This fixes a bug in commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b.
* Fix a problem with parallel workers being unable to restore role.Robert Haas2015-10-16
| | | | | | | | | check_role() tries to verify that the user has permission to become the requested role, but this is inappropriate in a parallel worker, which needs to exactly recreate the master's authorization settings. So skip the check in that case. This fixes a bug in commit 924bcf4f16d54c55310b28f77686608684734f42.
* Invalidate caches after cranking up a parallel worker transaction.Robert Haas2015-10-16
| | | | | | | | | | Starting a parallel worker transaction changes our notion of which XIDs are in-progress or committed, and our notion of the current command counter ID. Therefore, our view of these caches prior to starting this transaction may no longer valid. Defend against that by clearing them. This fixes a bug in commit 924bcf4f16d54c55310b28f77686608684734f42.
* Tighten up application of parallel mode checks.Robert Haas2015-10-16
| | | | | | | | | | | | | | Commit 924bcf4f16d54c55310b28f77686608684734f42 failed to enforce parallel mode checks during the commit of a parallel worker, because we exited parallel mode prior to ending the transaction so that we could pop the active snapshot. Re-establish parallel mode during parallel worker commit. Without this, it's far too easy for unsafe actions during the pre-commit sequence to crash the server instead of hitting the error checks as intended. Just to be extra paranoid, adjust a couple of the sanity checks in xact.c to check not only IsInParallelMode() but also IsParallelWorker().
* Transfer current command counter ID to parallel workers.Robert Haas2015-10-16
| | | | | | | | Commit 924bcf4f16d54c55310b28f77686608684734f42 correctly forbade parallel workers to modify the command counter while in parallel mode, but it inexplicably neglected to actually transfer the current command counter from leader to workers. This can result in the workers seeing a different set of tuples from the leader, which is bad. Repair.
* Don't send protocol messages to a shm_mq that no longer exists.Robert Haas2015-10-16
| | | | | | | | | | | Commit 2bd9e412f92bc6a68f3e8bcb18e04955cc35001d introduced a mechanism for relaying protocol messages from a background worker to another backend via a shm_mq. However, there was no provision for shutting down the communication channel. Therefore, a protocol message sent late in the shutdown sequence, such as a DEBUG message resulting from cranking up log_min_messages, could crash the server. To fix, install an on_dsm_detach callback that disables sending messages to the shm_mq when the associated DSM is detached.
* Re-Align *_freeze_max_age reloption limits with corresponding GUC limits.Andres Freund2015-10-05
| | | | | | | | | | | | | | In 020235a5754 I lowered the autovacuum_*freeze_max_age minimums to allow for easier testing of wraparounds. I did not touch the corresponding per-table limits. While those don't matter for the purpose of wraparound, it seems more consistent to lower them as well. It's noteworthy that the previous reloption lower limit for autovacuum_multixact_freeze_max_age was too high by one magnitude, even before 020235a5754. Discussion: 26377.1443105453@sss.pgh.pa.us Backpatch: back to 9.0 (in parts), like the prior patch
* Don't disable commit_ts in standby if enabled locallyAlvaro Herrera2015-10-02
| | | | Bug noticed by Fujii Masao
* Fix message punctuation according to style guidePeter Eisentraut2015-10-01
|
* Fix commit_ts for standbyAlvaro Herrera2015-10-01
| | | | | | | | | | | | | | | | | | | | | | | Module initialization was still not completely correct after commit 6b61955135e9, per crash report from Takashi Ohnishi. To fix, instead of trying to monkey around with the value of the GUC setting directly, add a separate boolean flag that enables the feature on a standby, but only for the startup (recovery) process, when it sees that its master server has the feature enabled. Discussion: http://www.postgresql.org/message-id/ca44c6c7f9314868bdc521aea4f77cbf@MP-MSGSS-MBX004.msg.nttdata.co.jp Also change the deactivation routine to delete all segment files rather than leaving the last one around. (This doesn't need separate WAL-logging, because on recovery we execute the same deactivation routine anyway.) In passing, clean up the code structure somewhat, particularly so that xlog.c doesn't know so much about when to activate/deactivate the feature. Thanks to Fujii Masao for testing and Petr Jelínek for off-list discussion. Back-patch to 9.5, where commit_ts was introduced.
* Don't dump core when destroying an unused ParallelContext.Robert Haas2015-09-30
| | | | | | If a transaction or subtransaction creates a ParallelContext but ends without calling InitializeParallelDSM, the previous code would seg fault. Fix that.
* Code review for transaction commit timestampsAlvaro Herrera2015-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There are three main changes here: 1. No longer cause a start failure in a standby if the feature is disabled in postgresql.conf but enabled in the master. This reverts one part of commit 4f3924d9cd43; what we keep is the ability of the standby to activate/deactivate the module (which includes creating and removing segments as appropriate) during replay of such actions in the master. 2. Replay WAL records affecting commitTS even if the feature is disabled. This means the standby will always have the same state as the master after replay. 3. Have COMMIT PREPARE record the transaction commit time as well. We were previously only applying it in the normal transaction commit path. Author: Petr Jelínek Discussion: http://www.postgresql.org/message-id/CAHGQGwHereDzzzmfxEBYcVQu3oZv6vZcgu1TPeERWbDc+gQ06g@mail.gmail.com Discussion: http://www.postgresql.org/message-id/CAHGQGwFuzfO4JscM9LCAmCDCxp_MfLvN4QdB+xWsS-FijbjTYQ@mail.gmail.com Additionally, I cleaned up nearby code related to replication origins, which I found a bit hard to follow, and fixed a couple of typos. Backpatch to 9.5, where this code was introduced. Per bug reports from Fujii Masao and subsequent discussion.
* Fix "sesssion" typoAlvaro Herrera2015-09-28
| | | | | | | It was introduced alongside replication origins, by commit 5aa2350426c, so backpatch to 9.5. Pointed out by Fujii Masao
* Remove legacy multixact truncation support.Andres Freund2015-09-26
| | | | | | | | | | | | | In 9.5 and master there is no need to support legacy truncation. This is just committed separately to make it easier to backpatch the WAL logged multixact truncation to 9.3 and 9.4 if we later decide to do so. I bumped master's magic from 0xD086 to 0xD088 and 9.5's from 0xD085 to 0xD087 to avoid 9.5 reusing a value that has been in use on master while keeping the numbers increasing between major versions. Discussion: 20150621192409.GA4797@alap3.anarazel.de Backpatch: 9.5
* Rework the way multixact truncations work.Andres Freund2015-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fact that multixact truncations are not WAL logged has caused a fair share of problems. Amongst others it requires to do computations during recovery while the database is not in a consistent state, delaying truncations till checkpoints, and handling members being truncated, but offset not. We tried to put bandaids on lots of these issues over the last years, but it seems time to change course. Thus this patch introduces WAL logging for multixact truncations. This allows: 1) to perform the truncation directly during VACUUM, instead of delaying it to the checkpoint. 2) to avoid looking at the offsets SLRU for truncation during recovery, we can just use the master's values. 3) simplify a fair amount of logic to keep in memory limits straight, this has gotten much easier During the course of fixing this a bunch of additional bugs had to be fixed: 1) Data was not purged from memory the member's SLRU before deleting segments. This happened to be hard or impossible to hit due to the interlock between checkpoints and truncation. 2) find_multixact_start() relied on SimpleLruDoesPhysicalPageExist - but that doesn't work for offsets that haven't yet been flushed to disk. Add code to flush the SLRUs to fix. Not pretty, but it feels slightly safer to only make decisions based on actual on-disk state. 3) find_multixact_start() could be called concurrently with a truncation and thus fail. Via SetOffsetVacuumLimit() that could lead to a round of emergency vacuuming. The problem remains in pg_get_multixact_members(), but that's quite harmless. For now this is going to only get applied to 9.5+, leaving the issues in the older branches in place. It is quite possible that we need to backpatch at a later point though. For the case this gets backpatched we need to handle that an updated standby may be replaying WAL from a not-yet upgraded primary. We have to recognize that situation and use "old style" truncation (i.e. looking at the SLRUs) during WAL replay. In contrast to before, this now happens in the startup process, when replaying a checkpoint record, instead of the checkpointer. Doing truncation in the restartpoint is incorrect, they can happen much later than the original checkpoint, thereby leading to wraparound. To avoid "multixact_redo: unknown op code 48" errors standbys would have to be upgraded before primaries. A later patch will bump the WAL page magic, and remove the legacy truncation codepaths. Legacy truncation support is just included to make a possible future backpatch easier. Discussion: 20150621192409.GA4797@alap3.anarazel.de Reviewed-By: Robert Haas, Alvaro Herrera, Thomas Munro Backpatch: 9.5 for now
* Allow autoanalyze to add pages deleted from pending list to FSMTeodor Sigaev2015-09-23
| | | | | | | | | | | | Commit e95680832854cf300e64c10de9cc2f586df558e8 introduces adding pages to FSM for ordinary insert, but autoanalyze was able just cleanup pending list without adding to FSM. Also fix double call of IndexFreeSpaceMapVacuum() during ginvacuumcleanup() Report from Fujii Masao Patch by me Review by Jeff Janes
* Add missing serial commaPeter Eisentraut2015-09-18
|
* Fix bug introduced by microvacuum for GiSTTeodor Sigaev2015-09-17
| | | | | | | | | | | | | | | Commit 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7 introduces microvacuum for GiST, deletetion of tuple marked LP_DEAD uses IndexPageMultiDelete while recovery code uses IndexPageTupleDelete in loop. This causes a difference in offset numbers of tuples to delete. Patch introduces usage of IndexPageMultiDelete in GiST except gistplacetopage() where only one tuple is deleted at once. That also slightly improve performance, because IndexPageMultiDelete is more effective. Patch changes WAL format, so bump wal page magic. Bug report from Jeff Janes Diagnostic and patch by Anastasia Lubennikova and me
* Improve log messages related to tablespace_map fileFujii Masao2015-09-15
| | | | | | | | | | | | | | | | | | This patch changes the log message which is logged when the server successfully renames backup_label file to *.old but fails to rename tablespace_map file during the shutdown. Previously the WARNING message "online backup mode was not canceled" was logged in that case. However this message is confusing because the backup mode is treated as canceled whenever backup_label is successfully renamed. So this commit makes the server log the message "online backup mode canceled" in that case. Also this commit changes errdetail messages so that they follow the error message style guide. Back-patch to 9.5 where tablespace_map file is introduced. Original patch by Amit Kapila, heavily modified by me.
* Add missing ReleaseBuffer call in BRIN revmap codeAlvaro Herrera2015-09-11
| | | | | | | | I think this particular branch is actually dead, but the analysis to prove that is not trivial, so instead take the weasel way. Reported by Jinyu Zhang Backpatch to 9.5, where BRIN was introduced.
* Fix oversight in 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7 commitTeodor Sigaev2015-09-09
| | | | Declaration of varibale inside ÓÝ×Õ
* Microvacuum for GISTTeodor Sigaev2015-09-09
| | | | | | | | | | | Mark index tuple as dead if it's pointed by kill_prior_tuple during ordinary (search) scan and remove it during insert process if there is no enough space for new tuple to insert. This improves select performance because index will not return tuple marked as dead and improves insert performance because it reduces number of page split. Anastasia Lubennikova <a.lubennikova@postgrespro.ru> with minor editorialization by me
* Remove files signaling a standby promotion request at postmaster startupFujii Masao2015-09-09
| | | | | | | | | | | | | | | | | | | | | | This commit makes postmaster forcibly remove the files signaling a standby promotion request. Otherwise, the existence of those files can trigger a promotion too early, whether a user wants that or not. This removal of files is usually unnecessary because they can exist only during a few moments during a standby promotion. However there is a race condition: if pg_ctl promote is executed and creates the files during a promotion, the files can stay around even after the server is brought up to new master. Then, if new standby starts by using the backup taken from that master, the files can exist at the server startup and should be removed in order to avoid an unexpected promotion. Back-patch to 9.1 where promote signal file was introduced. Problem reported by Feike Steenbergen. Original patch by Michael Paquier, modified by me. Discussion: 20150528100705.4686.91426@wrigleys.postgresql.org
* Allow per-tablespace effective_io_concurrencyAlvaro Herrera2015-09-08
| | | | | | | | | | Per discussion, nowadays it is possible to have tablespaces that have wildly different I/O characteristics from others. Setting different effective_io_concurrency parameters for those has been measured to improve performance. Author: Julien Rouhaud Reviewed by: Andres Freund