aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAge
* Spelling fixes in code commentsPeter Eisentraut2017-04-26
| | | | Author: Euler Taveira <euler@timbira.com.br>
* Fix typo in comment.Fujii Masao2017-04-27
| | | | Author: Masahiko Sawada
* Fix various concurrency issues in logical replication worker launchingPeter Eisentraut2017-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | The code was originally written with assumption that launcher is the only process starting the worker. However that hasn't been true since commit 7c4f52409 which failed to modify the worker management code adequately. This patch adds an in_use field to the LogicalRepWorker struct to indicate whether the worker slot is being used and uses proper locking everywhere this flag is set or read. However if the parent process dies while the new worker is starting and the new worker fails to attach to shared memory, this flag would never get cleared. We solve this rare corner case by adding a sort of garbage collector for in_use slots. This uses another field in the LogicalRepWorker struct named launch_time that contains the time when the worker was started. If any request to start a new worker does not find free slot, we'll check for workers that were supposed to start but took too long to actually do so, and reuse their slot. In passing also fix possible race conditions when stopping a worker that hasn't finished starting yet. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>
* doc PG10: add Rafia Sabih to parallel index scan itemBruce Momjian2017-04-26
| | | | Reported-by: Amit Kapila
* Allow ALTER TABLE ONLY on partitioned tablesStephen Frost2017-04-25
| | | | | | | | | | | There is no need to forbid ALTER TABLE ONLY on partitioned tables, when no partitions exist yet. This can be handy for users who are building up their partitioned table independently and will create actual partitions later. In addition, this is how pg_dump likes to operate in certain instances. Author: Amit Langote, with some error message word-smithing by me
* doc PG10: update EXPLAIN SUMMARY itemBruce Momjian2017-04-25
| | | | Reported-by: Tels
* Wake up launcher when enabling a subscriptionPeter Eisentraut2017-04-25
| | | | | | | | | | | | Otherwise one would have to wait up to DEFAULT_NAPTIME_PER_CYCLE until the subscription worker is considered for starting. There is a small race condition: If one enables a subscription right after disabling it, the launcher might not have registered the stopping when receiving the wakeup signal for the re-enabling. The start will then not happen right away but after the full cycle time. Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
* doc: update PG 10 item about referencing many relationsBruce Momjian2017-04-25
| | | | Reported-by: Tom Lane
* doc: add PG 10 doc item about VACUUM truncation, 7e26e02eeBruce Momjian2017-04-25
| | | | Reported-by: Andres Freund
* doc PG10: add commit 090010f2e and adjust EXPLAIN SUMMARY itemBruce Momjian2017-04-25
| | | | Reported-by: Tels, Andres Freund
* doc: properly indent SGML tags in PG 10 release notesBruce Momjian2017-04-25
|
* Set the priorities of all quorum synchronous standbys to 1.Fujii Masao2017-04-26
| | | | | | | | | | | | | | | | | | In quorum-based synchronous replication, all the standbys listed in synchronous_standby_names equally have chances to be chosen as synchronous standbys. So they should have the same priority. However, previously, quorum standbys whose names appear earlier in the list were given higher priority values though the difference of those priority values didn't affect the selection of synchronous standbys. Users could see those "meaningless" priority values in pg_stat_replication and this was confusing. This commit gives all the quorum synchronous standbys the same highest priority, i.e., 1, in order to remove such confusion. Author: Fujii Masao Reviewed-by: Masahiko Sawada, Kyotaro Horiguchi Discussion: http://postgr.es/m/CAHGQGwEKOw=SmPLxJzkBsH6wwDBgOnVz46QjHbtsiZ-d-2RGUg@mail.gmail.com
* doc: PG 10 release notes updatesBruce Momjian2017-04-25
| | | | Reported-by: Michael Paquier, Felix Gerzaguet
* doc: PG 10 release note updatesBruce Momjian2017-04-25
| | | | Reported-by: David Rowley, Amit Langote, Ashutosh Bapat
* Adjust outdated comment.Robert Haas2017-04-25
| | | | | | | | | | Commit 5dfc198146b49ce7ecc8a1fc9d5e171fb75f6ba5 removed the only existing caller of hash_freeze, but left behind a comment indicating that hash_freeze was still used. Adjust. Kyotaro Horiguchi Discussion: http://postgr.es/m/20170424.165541.230634914.horiguchi.kyotaro@lab.ntt.co.jp
* Update copyright in recently added files.Fujii Masao2017-04-25
| | | | | | This commit also fixes copyright line missed by the automated script. Author: Masahiko Sawada
* doc: move hash info to new section and split out growth itemBruce Momjian2017-04-25
| | | | Reported-by: Amit Kapila
* doc: move hash performance item into index sectionBruce Momjian2017-04-24
| | | | | | | The requirement to rebuild pg_upgrade-ed hash indexes was kept in the incompatibilities section. Reported-by: Amit Kapila
* doc: add Rafia Sabih to PG 10 release note itemBruce Momjian2017-04-24
| | | | Reported-by: Amit Kapila
* doc: fix PG 10 release note doc markupBruce Momjian2017-04-24
|
* doc: merge PG 10 release SysV itemBruce Momjian2017-04-24
| | | | Reported-by: Takayuki Tsunakawa
* postgres_fdw: Fix join push down with extensionsPeter Eisentraut2017-04-24
| | | | | | | | | | | | | | | | | | | | Objects in an extension are shippable to a foreign server if the extension is part of the foreign server definition's shippable extensions list. But this was not properly considered in some cases when checking whether a join condition can be pushed to a foreign server and the join condition uses an object from a shippable extension. So the join would never be pushed down in those cases. So, the list of extensions needs to be made available in fpinfo of the relation being considered to be pushed down before any expressions are assessed for being shippable. Fix foreign_join_ok() to do that for a join relation. The code to save FDW options in fpinfo is scattered at multiple places. Bring all of that together into functions apply_server_options(), apply_table_options(), and merge_fdw_options(). David Rowley and Ashutosh Bapat, per report from David Rowley
* doc: PG 10 fixesBruce Momjian2017-04-24
| | | | Reported-by: Takayuki Tsunakawa
* doc: several minor PG 10 doc adjustmentsBruce Momjian2017-04-24
|
* doc: fix attribution of sequence item, order incompatibilitiesBruce Momjian2017-04-24
| | | | Reported-by: Andreas Karlsson
* doc: first draft of Postgres 10 release notesBruce Momjian2017-04-24
|
* doc: update release doc markup instructionsBruce Momjian2017-04-24
|
* Revert "Use pselect(2) not select(2), if available, to wait in postmaster's ↵Tom Lane2017-04-24
| | | | | | | | | | | | | | loop." This reverts commit 81069a9efc5a374dd39874a161f456f0fb3afba4. Buildfarm results suggest that some platforms have versions of pselect(2) that are not merely non-atomic, but flat out non-functional. Revert the use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART patch as the source of trouble). If it's so, we should probably look into blacklisting specific platforms that have broken pselect. Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us
* Use pselect(2) not select(2), if available, to wait in postmaster's loop.Tom Lane2017-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally we've unblocked signals, called select(2), and then blocked signals again. The code expects that the select() will be cancelled with EINTR if an interrupt occurs; but there's a race condition, which is that an already-pending signal will be delivered as soon as we unblock, and then when we reach select() there will be nothing preventing it from waiting. This can result in a long delay before we perform any action that ServerLoop was supposed to have taken in response to the signal. As with the somewhat-similar symptoms fixed by commit 893902085, the main practical problem is slow launching of parallel workers. The window for trouble is usually pretty short, corresponding to one iteration of ServerLoop; but it's not negligible. To fix, use pselect(2) in place of select(2) where available, as that's designed to solve exactly this problem. Where not available, we continue to use the old way, and are no worse off than before. pselect(2) has been required by POSIX since about 2001, so most modern platforms should have it. A bigger portability issue is that some implementations are said to be non-atomic, ie pselect() isn't really any different from unblock/select/reblock. Still, we're no worse off than before on such a platform. There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point this could be reverted, since we'd be using a self-pipe to solve the race condition. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
* Run the postmaster's signal handlers without SA_RESTART.Tom Lane2017-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The postmaster keeps signals blocked everywhere except while waiting for something to happen in ServerLoop(). The code expects that the select(2) will be cancelled with EINTR if an interrupt occurs; without that, followup actions that should be performed by ServerLoop() itself will be delayed. However, some platforms interpret the SA_RESTART signal flag as meaning that they should restart rather than cancel the select(2). Worse yet, some of them restart it with the original timeout delay, meaning that a steady stream of signal interrupts can prevent ServerLoop() from iterating at all if there are no incoming connection requests. Observable symptoms of this, on an affected platform such as HPUX 10, include extremely slow parallel query startup (possibly as much as 30 seconds) and failure to update timestamps on the postmaster's sockets and lockfiles when no new connections arrive for a long time. We can fix this by running the postmaster's signal handlers without SA_RESTART. That would be quite a scary change if the range of code where signals are accepted weren't so tiny, but as it is, it seems safe enough. (Note that postmaster children do, and must, reset all the handlers before unblocking signals; so this change should not affect any child process.) There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point it might be appropriate to revert this patch. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
* Get rid of extern declarations of non-existent functions.Fujii Masao2017-04-25
| | | | | | Those extern declartions were mistakenly added by commit 7c4f52409. Author: Petr Jelinek
* Fix postmaster's handling of fork failure for a bgworker process.Tom Lane2017-04-24
| | | | | | | | | | | | | | | | | | | | | This corner case didn't behave nicely at all: the postmaster would (partially) update its state as though the process had started successfully, and be quite confused thereafter. Fix it to act like the worker had crashed, instead. In passing, refactor so that do_start_bgworker contains all the state-change logic for bgworker launch, rather than just some of it. Back-patch as far as 9.4. 9.3 contains similar logic, but it's just enough different that I don't feel comfortable applying the patch without more study; and the use of bgworkers in 9.3 was so small that it doesn't seem worth the extra work. transam/parallel.c is still entirely unprepared for the possibility of bgworker startup failure, but that seems like material for a separate patch. Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us
* Code review for commands/statscmds.c.Tom Lane2017-04-24
| | | | | | | | | | | | | | | | | | | | Fix machine-dependent sorting of column numbers. (Odd behavior would only materialize for column numbers above 255, but that's certainly legal.) Fix poor choice of SQLSTATE for some errors, and improve error message wording. (Notably, "is not a scalar type" is a totally misleading way to explain "does not have a default btree opclass".) Avoid taking AccessExclusiveLock on the associated relation during DROP STATISTICS. That's neither necessary nor desirable, and it could easily have put us into situations where DROP fails (compare commit 68ea2b7f9). Adjust/improve comments. David Rowley and Tom Lane Discussion: https://postgr.es/m/CAKJS1f-GmCfPvBbAEaM5xoVOaYdVgVN1gicALSoYQ77z-+vLbw@mail.gmail.com
* Don't include sys/poll.h anymore.Andres Freund2017-04-23
| | | | | | | | | | poll.h is mandated by Single Unix Spec v2, the usual baseline for postgres on unix. None of the unixoid buildfarms animals has sys/poll.h but not poll.h. Therefore there's not much point to test for sys/poll.h's existence and include it optionally. Author: Andres Freund, per suggestion from Tom Lane Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us
* Zero padding in replication origin's checkpointed on disk-state.Andres Freund2017-04-23
| | | | | | | | | | | | | | | | This seems to be largely cosmetic, avoiding valgrind bleats and the like. The uninitialized padding influences the CRC of the on-disk entry, but because it's also used when verifying the CRC, that doesn't cause spurious failures. Backpatch nonetheless. It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql doesn't exercise the checkpoint path, but checkpoints are fairly expensive on weaker machines, and we'd have to stop/start for that to be meaningful. Author: Andres Freund Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de Backpatch: 9.5, where replication origins were introduced
* Initialize all memory for logical replication relation cache.Andres Freund2017-04-23
| | | | | | | | | | | As reported by buildfarm animal skink / valgrind, some of the variables weren't always initialized. To avoid further mishaps use memset to ensure the entire entry is initialized. Author: Petr Jelinek Reported-By: Andres Freund Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de Backpatch: none, code new in master
* Remove select(2) backed latch implementation.Andres Freund2017-04-23
| | | | | | | | | | | | | poll(2) is required by Single Unix Spec v2, the usual baseline for postgres (leaving windows aside). There's not been any buildfarm animals without poll(2) for a long while, leaving the select(2) implementation to be largely untested. On windows, including mingw, poll() is not available, but we have a special case implementation for windows anyway. Author: Andres Freund Discussion: https://postgr.es/m/20170420003611.7r2sdvehesdyiz2i@alap3.anarazel.de
* Workaround for RecoverPreparedTransactions()Simon Riggs2017-04-23
| | | | | | Force overwriteOK = true while we investigate deeper fix Proposed by Tom Lane as temporary measure, accepted by me
* Fix LagTrackerRead() for timeline incrementsSimon Riggs2017-04-23
| | | | | | | | | Bug was masked by error in running 004_timeline_switch.pl that was fixed recently in 7d68f2281a. Detective work by Alvaro Herrera and Tom Lane Author: Thomas Munro
* Fix order of arguments to SubTransSetParent().Tom Lane2017-04-23
| | | | | | | | | | | | | | ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions) mixed up the parent and child XIDs when calling SubTransSetParent to record the transactions' relationship in pg_subtrans. Remarkably, analysis by Simon Riggs suggests that this doesn't lead to visible problems (at least, not in non-Assert builds). That might explain why we'd not noticed it before. Nonetheless, it's surely wrong. This code was born broken, so back-patch to all supported branches. Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
* Fix TAP infrastructure to support Mingw betterAndrew Dunstan2017-04-23
| | | | | | | archive_command and restore_command need to refer to Windows paths, not Msys virtual file system paths, as postgres is completely unaware of the latter, so prefix them with the Windows path to the virtual file system root. Clean psql and pg_recvlogical output of carriage returns.
* Make PostgresNode.pm check server status more carefully.Tom Lane2017-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | PostgresNode blithely ignored the exit status of pg_ctl, and in general made no effort to be sure that the server was running when it should be. This caused it to miss server crashes, which is a serious shortcoming in a test scaffold. Make it complain if pg_ctl fails, and modify the start and stop logic to complain if the server doesn't start, or doesn't stop, when expected. Also, have it turn off the "restart_after_crash" configuration parameter in created clusters, as bitter experience has shown that leaving that on can mask crashes too. We might at some point need variant functions that allow for, eg, server start failure to be expected. But no existing test case appears to want that, and it surely shouldn't be the default behavior. Note that this *will* break the buildfarm, as it will expose known bugs that the previous testing failed to. I'm committing it despite that, to verify that we get the expected failures in the buildfarm not just in manual testing. Back-patch into 9.6 where PostgresNode was introduced. (The 9.6 branch is not expected to show any failures.) Discussion: https://postgr.es/m/21432.1492886428@sss.pgh.pa.us
* Make PostgresNode::append_conf append a newline automatically.Tom Lane2017-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Although the documentation for append_conf said clearly that it didn't add a newline, many test authors seem to have forgotten that ... or maybe they just consulted the example at the top of the POD documentation, which clearly shows adding a config entry without bothering to add a trailing newline. The worst part of that is that it works, as long as you don't do it more than once, since the backend isn't picky about whether config files end with newlines. So there's not a strong forcing function reminding test authors not to do it like that. Upshot is that this is a terribly fragile way to go about things, and there's at least one existing test case that is demonstrably broken and not testing what it thinks it is. Let's just make append_conf append a newline, instead; that is clearly way safer than the old definition. I also cleaned up a few call sites that were unnecessarily ugly. (I left things alone in places where it's plausible that additional config lines would need to be added someday.) Back-patch the change in append_conf itself to 9.6 where it was added, as having a definitional inconsistency between branches would obviously be pretty hazardous for back-patching TAP tests. The other changes are just cosmetic and don't need to be back-patched. Discussion: https://postgr.es/m/19751.1492892376@sss.pgh.pa.us
* Require sufficiently modern version of Test::More for TAP testsAndrew Dunstan2017-04-22
| | | | | | Ancient versions of Test::More don't support the note() function used in some TAP tests, so we require the minimum version of the module that does.
* Partially revert commit 536d47bd9d5fce8d91929bee3128fa1d08dbcc57.Tom Lane2017-04-22
| | | | | | | | | | | | | Per buildfarm, the "#ifdef F_SETFD" removed in that commit actually is needed on Windows, because fcntl() isn't available at all on that platform, unless using Cygwin. We could perhaps spell it more like "#ifdef HAVE_FCNTL", or "#ifndef WIN32", but it's not clear that those choices are better. It does seem that we don't need the bogus manual definition of FD_CLOEXEC, though, so keep that change. Discussion: https://postgr.es/m/26254.1492805635@sss.pgh.pa.us
* doc: Update linkPeter Eisentraut2017-04-21
| | | | | | | | The reference "That is the topic of the next section." has been incorrect since the materialized views documentation got inserted between the section "rules-views" and "rules-update". Author: Zertrin <postgres_wiki@zertrin.org>
* Avoid depending on non-POSIX behavior of fcntl(2).Tom Lane2017-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The POSIX standard does not say that the success return value for fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1. We had several calls that were making the stronger assumption. Adjust them to test specifically for -1 for strict spec compliance. The standard further leaves open the possibility that the O_NONBLOCK flag bit is not the only active one in F_SETFL's argument. Formally, therefore, one ought to get the current flags with F_GETFL and store them back with only the O_NONBLOCK bit changed when trying to change the nonblock state. In port/noblock.c, we were doing the full pushup in pg_set_block but not in pg_set_noblock, which is just weird. Make both of them do it properly, since they have little business making any assumptions about the socket they're handed. The other places where we're issuing F_SETFL are working with FDs we just got from pipe(2), so it's reasonable to assume the FDs' properties are all default, so I didn't bother adding F_GETFL steps there. Also, while pg_set_block deserves some points for trying to do things right, somebody had decided that it'd be even better to cast fcntl's third argument to "long". Which is completely loony, because POSIX clearly says the third argument for an F_SETFL call is "int". Given the lack of field complaints, these missteps apparently are not of significance on any common platforms. But they're still wrong, so back-patch to all supported branches. Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
* Change the on-disk format of SCRAM verifiers to conform to RFC 5803.Heikki Linnakangas2017-04-21
| | | | | | | | | | | | | | | | | | | It doesn't make any immediate difference to PostgreSQL, but might as well follow the standard, since one exists. (I looked at RFC 5803 earlier, but didn't fully understand it back then.) The new format uses Base64 instead of hex to encode StoredKey and ServerKey, which makes the verifiers slightly smaller. Using the same encoding for the salt and the keys also means that you only need one encoder/decoder instead of two. Although we have code in the backend to do both, we are talking about teaching libpq how to create SCRAM verifiers for PQencodePassword(), and libpq doesn't currently have any code for hex encoding. Bump catversion, because this renders any existing SCRAM verifiers in pg_authid invalid. Discussion: https://www.postgresql.org/message-id/351ba574-85ea-d9b8-9689-8c928dd0955d@iki.fi
* doc: Fix typoPeter Eisentraut2017-04-21
|
* Remove long-obsolete catering for platforms without F_SETFD/FD_CLOEXEC.Tom Lane2017-04-21
| | | | | | | | | SUSv2 mandates that <fcntl.h> provide both F_SETFD and FD_CLOEXEC, so it seems pretty unlikely that any platforms remain without those. Remove the #ifdef-ery installed by commit 7627b91cd to see if the buildfarm agrees. Discussion: https://postgr.es/m/21444.1492798101@sss.pgh.pa.us