aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Fix missing abort checks in pg_backup_directory.c.Tom Lane2016-05-29
| | | | | | | | | | | | | Parallel restore from directory format failed to respond to control-C in a timely manner, because there were no checkAborting() calls in the code path that reads data from a file and sends it to the backend. If any worker was in the midst of restoring data for a large table, you'd just have to wait. This fix doesn't do anything for the problem of aborting a long-running server-side command, but at least it fixes things for data transfers. Back-patch to 9.3 where parallel restore was introduced.
* Remove pg_dump/parallel.c's useless "aborting" flag.Tom Lane2016-05-29
| | | | | | | | | | | | | | This was effectively dead code, since the places that tested it could not be reached after we entered the on-exit-cleanup routine that would set it. It seems to have been a leftover from a design in which error abort would try to send fresh commands to the workers --- a design which could never have worked reliably, of course. Since the flag is not cross-platform, it complicates reasoning about the code's behavior, which we could do without. Although this is effectively just cosmetic, back-patch anyway, because there are some actual bugs in the vicinity of this behavior. Discussion: <15583.1464462418@sss.pgh.pa.us>
* Lots of comment-fixing, and minor cosmetic cleanup, in pg_dump/parallel.c.Tom Lane2016-05-28
| | | | | | | | | | | | | | | | | | | | | | | | The commentary in this file was in extremely sad shape. The author(s) had clearly never heard of the project convention that a function header comment should provide an API spec of some sort for that function. Much of it was flat out wrong, too --- maybe it was accurate when written, but if so it had not been updated to track subsequent code revisions. Rewrite and rearrange to try to bring it up to speed, and annotate some of the places where more work is needed. (I've refrained from actually fixing anything of substance ... yet.) Also, rename a couple of functions for more clarity as to what they do, do some very minor code rearrangement, remove some pointless Asserts, fix an incorrect Assert in readMessageFromPipe, and add a missing socket close in one error exit from pgpipe(). The last would be a bug if we tried to continue after pgpipe() failure, but since we don't, it's just cosmetic at present. Although this is only cosmetic, back-patch to 9.3 where parallel.c was added. It's sufficiently invasive that it'll pose a hazard for future back-patching if we don't. Discussion: <25239.1464386067@sss.pgh.pa.us>
* Clean up thread management in parallel pg_dump for Windows.Tom Lane2016-05-27
| | | | | | | | | | | | | | | | | | | | | | | | Since we start the worker threads with _beginthreadex(), we should use _endthreadex() to terminate them. We got this right in the normal-exit code path, but not so much during an error exit from a worker. In addition, be sure to apply CloseHandle to the thread handle after each thread exits. It's not clear that these oversights cause any user-visible problems, since the pg_dump run is about to terminate anyway. Still, it's clearly better to follow Microsoft's API specifications than ignore them. Also a few cosmetic cleanups in WaitForTerminatingWorkers(), including being a bit less random about where to cast between uintptr_t and HANDLE, and being sure to clear the worker identity field for each dead worker (not that false matches should be possible later, but let's be careful). Original observation and patch by Armin Schöffmann, cosmetic improvements by Michael Paquier and me. (Armin's patch also included closing sockets in ShutdownWorkersHard(), but that's been dealt with already in commit df8d2d8c4.) Back-patch to 9.3 where parallel pg_dump was introduced. Discussion: <zarafa.570306bd.3418.074bf1420d8f2ba2@root.aegaeon.de>
* Be more predictable about reporting "lock timeout" vs "statement timeout".Tom Lane2016-05-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If both timeout indicators are set when we arrive at ProcessInterrupts, we've historically just reported "lock timeout". However, some buildfarm members have been observed to fail isolationtester's timeouts test by reporting "lock timeout" when the statement timeout was expected to fire first. The cause seems to be that the process is allowed to sleep longer than expected (probably due to heavy machine load) so that the lock timeout happens before we reach the point of reporting the error, and then this arbitrary tiebreak rule does the wrong thing. We can improve matters by comparing the scheduled timeout times to decide which error to report. I had originally proposed greatly reducing the 1-second window between the two timeouts in the test cases. On reflection that is a bad idea, at least for the case where the lock timeout is expected to fire first, because that would assume that it takes negligible time to get from statement start to the beginning of the lock wait. Thus, this patch doesn't completely remove the risk of test failures on slow machines. Empirically, however, the case this handles is the one we are seeing in the buildfarm. The explanation may be that the other case requires the scheduler to take the CPU away from a busy process, whereas the case fixed here only requires the scheduler to not give the CPU back right away to a process that has been woken from a multi-second sleep (and, perhaps, has been swapped out meanwhile). Back-patch to 9.3 where the isolationtester timeouts test was added. Discussion: <8693.1464314819@sss.pgh.pa.us>
* Make pg_dump error cleanly with -j against hot standbyMagnus Hagander2016-05-26
| | | | | | | Getting a synchronized snapshot is not supported on a hot standby node, and is by default taken when using -j with multiple sessions. Trying to do so still failed, but with a server error that would also go in the log. Instead, proprely detect this case and give a better error message.
* Make pg_dump behave more sanely when built without HAVE_LIBZ.Tom Lane2016-05-26
| | | | | | | | | | | | | | | | | | For some reason the code to emit a warning and switch to uncompressed output was placed down in the guts of pg_backup_archiver.c. This is definitely too late in the case of parallel operation (and I rather wonder if it wasn't too late for other purposes as well). Put it in pg_dump.c's option-processing logic, which seems a much saner place. Also, the default behavior with custom or directory output format was to emit the warning telling you the output would be uncompressed. This seems unhelpful, so silence that case. Back-patch to 9.3 where parallel dump was introduced. Kyotaro Horiguchi, adjusted a bit by me Report: <20160526.185551.242041780.horiguchi.kyotaro@lab.ntt.co.jp>
* In Windows pg_dump, ensure idle workers will shut down during error exit.Tom Lane2016-05-26
| | | | | | | | | | | | | | The Windows coding of ShutdownWorkersHard() thought that setting termEvent was sufficient to make workers exit after an error. But that only helps if a worker is busy and passes through checkAborting(). An idle worker will just sit, resulting in pg_dump failing to exit until the user gives up and hits control-C. We should close the write end of the command pipe so that idle workers will see socket EOF and exit, as the Unix coding was already doing. Back-patch to 9.3 where parallel pg_dump was introduced. Kyotaro Horiguchi
* Ensure that backends see up-to-date statistics for shared catalogs.Tom Lane2016-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | Ever since we split the statistics collector's reports into per-database files (commit 187492b6c2e8cafc), backends have been seeing stale statistics for shared catalogs. This is because the inquiry message only prompts the collector to write the per-database file for the requesting backend's own database. Stats for shared catalogs are in a separate file for "DB 0", which didn't get updated. In normal operation this was partially masked by the fact that the autovacuum launcher would send an inquiry message at least once per autovacuum_naptime that asked for "DB 0"; so the shared-catalog stats would never be more than a minute out of date. However the problem becomes very obvious with autovacuum disabled, as reported by Peter Eisentraut. To fix, redefine the semantics of inquiry messages so that both the specified DB and DB 0 will be dumped. (This might seem a bit inefficient, but we have no good way to know whether a backend's transaction will look at shared-catalog stats, so we have to read both groups of stats whenever we request stats. Sending two inquiry messages would definitely not be better.) Back-patch to 9.3 where the bug was introduced. Report: <56AD41AC.1030509@gmx.net>
* Fix broken error handling in parallel pg_dump/pg_restore.Tom Lane2016-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the original design for parallel dump, worker processes reported errors by sending them up to the master process, which would print the messages. This is unworkably fragile for a couple of reasons: it risks deadlock if a worker sends an error at an unexpected time, and if the master has already died for some reason, the user will never get to see the error at all. Revert that idea and go back to just always printing messages to stderr. This approach means that if all the workers fail for similar reasons (eg, bad password or server shutdown), the user will see N copies of that message, not only one as before. While that's slightly annoying, it's certainly better than not seeing any message; not to mention that we shouldn't assume that only the first failure is interesting. An additional problem in the same area was that the master failed to disable SIGPIPE (at least until much too late), which meant that sending a command to an already-dead worker would cause the master to crash silently. That was bad enough in itself but was made worse by the total reliance on the master to print errors: even if the worker had reported an error, you would probably not see it, depending on timing. Instead disable SIGPIPE right after we've forked the workers, before attempting to send them anything. Additionally, the master relies on seeing socket EOF to realize that a worker has exited prematurely --- but on Windows, there would be no EOF since the socket is attached to the process that includes both the master and worker threads, so it remains open. Make archive_close_connection() close the worker end of the sockets so that this acts more like the Unix case. It's not perfect, because if a worker thread exits without going through exit_nicely() the closures won't happen; but that's not really supposed to happen. This has been wrong all along, so back-patch to 9.3 where parallel dump was introduced. Report: <2458.1450894615@sss.pgh.pa.us>
* Fetch XIDs atomically during vac_truncate_clog().Tom Lane2016-05-24
| | | | | | | | | | Because vac_update_datfrozenxid() updates datfrozenxid and datminmxid in-place, it's unsafe to assume that successive reads of those values will give consistent results. Fetch each one just once to ensure sane behavior in the minimum calculation. Noted while reviewing Alexander Korotkov's patch in the same area. Discussion: <8564.1464116473@sss.pgh.pa.us>
* Avoid consuming an XID during vac_truncate_clog().Tom Lane2016-05-24
| | | | | | | | | | | | | | | | | | | vac_truncate_clog() uses its own transaction ID as the comparison point in a sanity check that no database's datfrozenxid has already wrapped around "into the future". That was probably fine when written, but in a lazy vacuum we won't have assigned an XID, so calling GetCurrentTransactionId() causes an XID to be assigned when otherwise one would not be. Most of the time that's not a big problem ... but if we are hard up against the wraparound limit, consuming XIDs during antiwraparound vacuums is a very bad thing. Instead, use ReadNewTransactionId(), which not only avoids this problem but is in itself a better comparison point to test whether wraparound has already occurred. Report and patch by Alexander Korotkov. Back-patch to all versions. Report: <CAPpHfdspOkmiQsxh-UZw2chM6dRMwXAJGEmmbmqYR=yvM7-s6A@mail.gmail.com>
* Support IndexElem in raw_expression_tree_walker().Tom Lane2016-05-23
| | | | | | | | | Needed for cases in which INSERT ... ON CONFLICT appears inside a recursive CTE item. Per bug #14153 from Thomas Alton. Patch by Peter Geoghegan, slightly adjusted by me Report: <20160521232802.22598.13537@wrigleys.postgresql.org>
* Fix latent crash in do_text_output_multiline().Tom Lane2016-05-23
| | | | | | | | | | | | | | | | | | | | | do_text_output_multiline() would fail (typically with a null pointer dereference crash) if its input string did not end with a newline. Such cases do not arise in our current sources; but it certainly could happen in future, or in extension code's usage of the function, so we should fix it. To fix, replace "eol += len" with "eol = text + len". While at it, make two cosmetic improvements: mark the input string const, and rename the argument from "text" to "txt" to dodge pgindent strangeness (since "text" is a typedef name). Even though this problem is only latent at present, it seems like a good idea to back-patch the fix, since it's a very simple/safe patch and it's not out of the realm of possibility that we might in future back-patch something that expects sane behavior from do_text_output_multiline(). Per report from Hao Lee. Report: <CAGoxFiFPAGyPAJLcFxTB5cGhTW2yOVBDYeqDugYwV4dEd1L_Ag@mail.gmail.com>
* Fix bogus commentsAlvaro Herrera2016-05-12
| | | | | | Some comments mentioned XLogReplayBuffer, but there's no such function: that was an interim name for a function that got renamed to XLogReadBufferForRedo, before commit 2c03216d831160 was pushed.
* Fix obsolete commentAlvaro Herrera2016-05-12
|
* Fix infer_arbiter_indexes() to not barf on system columns.Tom Lane2016-05-11
| | | | | | | | | | | | | While it could be argued that rejecting system column mentions in the ON CONFLICT list is an unsupported feature, falling over altogether just because the table has a unique index on OID is indubitably a bug. As far as I can tell, fixing infer_arbiter_indexes() is sufficient to make ON CONFLICT (oid) actually work, though making a regression test for that case is problematic because of the impossibility of setting the OID counter to a known value. Minor cosmetic cleanups along with the bug fix.
* Fix assorted missing infrastructure for ON CONFLICT.Tom Lane2016-05-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | subquery_planner() failed to apply expression preprocessing to the arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the theory was that this wasn't necessary because we don't actually try to execute those expressions; but that's wrong, because it results in failure to match to index expressions or index predicates that are changed at all by preprocessing. Per bug #14132 from Reynold Smith. Also add pullup_replace_vars processing for onConflictWhere. Perhaps it's impossible to have a subquery reference there, but I'm not exactly convinced; and even if true today it's a failure waiting to happen. Also add some comments to other places where one or another field of OnConflictExpr is intentionally ignored, with explanation as to why it's okay to do so. Also, catalog/dependency.c failed to record any dependency on the named constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to be dropped while rules exist that depend on it, and allowing pg_dump to dump such a rule before the constraint it refers to. The normal execution path managed to error out reasonably for a dangling constraint reference, but ruleutils.c dumped core; so in addition to fixing the omission, add a protective check in ruleutils.c, since we can't retroactively add a dependency in existing databases. Back-patch to 9.5 where this code was introduced. Report: <20160510190350.2608.48667@wrigleys.postgresql.org>
* Fix autovacuum for shared relationsAlvaro Herrera2016-05-10
| | | | | | | | | | | | | | | | | | | | The table-skipping logic in autovacuum would fail to consider that multiple workers could be processing the same shared catalog in different databases. This normally wouldn't be a problem: firstly because autovacuum workers not for wraparound would simply ignore tables in which they cannot acquire lock, and secondly because most of the time these tables are small enough that even if multiple for-wraparound workers are stuck in the same catalog, they would be over pretty quickly. But in cases where the catalogs are severely bloated it could become a problem. Backpatch all the way back, because the problem has been there since the beginning. Reported by Ondřej Světlík Discussion: https://www.postgresql.org/message-id/572B63B1.3030603%40flexibee.eu https://www.postgresql.org/message-id/572A1072.5080308%40flexibee.eu
* Stamp 9.5.3.REL9_5_3Tom Lane2016-05-09
|
* Translation updatesPeter Eisentraut2016-05-09
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 7a7a803d44fad7952cf6b1a1da29df2b06b1b380
* Distrust external OpenSSL clients; clear err queuePeter Eisentraut2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OpenSSL has an unfortunate tendency to mix per-session state error handling with per-thread error handling. This can cause problems when programs that link to libpq with OpenSSL enabled have some other use of OpenSSL; without care, one caller of OpenSSL may cause problems for the other caller. Backend code might similarly be affected, for example when a third party extension independently uses OpenSSL without taking the appropriate precautions. To fix, don't trust other users of OpenSSL to clear the per-thread error queue. Instead, clear the entire per-thread queue ahead of certain I/O operations when it appears that there might be trouble (these I/O operations mostly need to call SSL_get_error() to check for success, which relies on the queue being empty). This is slightly aggressive, but it's pretty clear that the other callers have a very dubious claim to ownership of the per-thread queue. Do this is both frontend and backend code. Finally, be more careful about clearing our own error queue, so as to not cause these problems ourself. It's possibly that control previously did not always reach SSLerrmessage(), where ERR_get_error() was supposed to be called to clear the queue's earliest code. Make sure ERR_get_error() is always called, so as to spare other users of OpenSSL the possibility of similar problems caused by libpq (as opposed to problems caused by a third party OpenSSL library like PHP's OpenSSL extension). Again, do this is both frontend and backend code. See bug #12799 and https://bugs.php.net/bug.php?id=68276 Based on patches by Dave Vitek and Peter Eisentraut. From: Peter Geoghegan <pg@bowt.ie>
* Fix SSL testsPeter Eisentraut2016-05-06
| | | | | These were accidentally broken by the great backpatching of 331828b754378733cb5c2e49227603e7354e4e39.
* Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old.Tom Lane2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch essentially reverts commit 4c6780fd17aa43ed, in favor of a much simpler solution for the case where the new cluster would choose to create a TOAST table but the old cluster doesn't have one: just don't create a TOAST table. The existing code failed in at least two different ways if the situation arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the lock sanity check in create_toast_table failed; (2) pg_upgrade did not provide a pg_type OID for the new toast table, so that the crosscheck in TypeCreate failed. While both these problems were introduced by later patches, they show that the hack being used to cause TOAST table creation is overwhelmingly fragile (and untested). I also note that before the TypeCreate crosscheck was added, the code would have resulted in assigning an indeterminate pg_type OID to the toast table, possibly causing a later OID conflict in that catalog; so that it didn't really work even when committed. If we simply don't create a TOAST table, there will only be a problem if the code tries to store a tuple that's wider than a page, and field compression isn't sufficient to get it under a page. Given that the TOAST creation threshold is intended to be about a quarter of a page, it's very hard to believe that cross-version differences in the do-we-need-a-toast- table heuristic could result in an observable problem. So let's just follow the old version's conclusion about whether a TOAST table is needed. (If we ever do change needs_toast_table() so much that this conclusion doesn't apply, we can devise a solution at that time, and hopefully do it in a less klugy way than 4c6780fd17aa43ed did.) Back-patch to 9.3, like the previous patch. Discussion: <8110.1462291671@sss.pgh.pa.us>
* Fix possible read past end of string in to_timestamp().Tom Lane2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | to_timestamp() handles the TH/th format codes by advancing over two input characters, whatever those are. It failed to notice whether there were two characters available to be skipped, making it possible to advance the pointer past the end of the input string and keep on parsing. A similar risk existed in the handling of "Y,YYY" format: it would advance over three characters after the "," whether or not three characters were available. In principle this might be exploitable to disclose contents of server memory. But the security team concluded that it would be very hard to use that way, because the parsing loop would stop upon hitting any zero byte, and TH/th format codes can't be consecutive --- they have to follow some other format code, which would have to match whatever data is there. So it seems impractical to examine memory very much beyond the end of the input string via this bug; and the input string will always be in local memory not in disk buffers, making it unlikely that anything very interesting is close to it in a predictable way. So this doesn't quite rise to the level of needing a CVE. Thanks to Wolf Roediger for reporting this bug.
* Update time zone data files to tzdata release 2016d.Tom Lane2016-05-05
| | | | | | | DST law changes in Russia (Magadan, Tomsk regions) and Venezuela. Historical corrections for Russia. There are new zone names Europe/Kirov and Asia/Tomsk reflecting the fact that these regions now have different time zone histories from adjacent regions.
* Fix mishandling of equivalence-class tests in parameterized plans.Tom Lane2016-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given a three-or-more-way equivalence class, such as X.Y = Y.Y = Z.Z, it was possible for the planner to omit one of the quals needed to enforce that all members of the equivalence class are actually equal. This only happened in the case of a parameterized join node for two of the relations, that is a plan tree like Nested Loop -> Scan X -> Nested Loop -> Scan Y -> Scan Z Filter: Z.Z = X.X The eclass machinery normally expects to apply X.X = Y.Y when those two relations are joined, but in this shape of plan tree they aren't joined until the top node --- and, if the lower nested loop is marked as parameterized by X, the top node will assume that the relevant eclass condition(s) got pushed down into the lower node. On the other hand, the scan of Z assumes that it's only responsible for constraining Z.Z to match any one of the other eclass members. So one or another of the required quals sometimes fell between the cracks, depending on whether consideration of the eclass in get_joinrel_parampathinfo() for the lower nested loop chanced to generate X.X = Y.Y or X.X = Z.Z as the appropriate constraint there. If it generated the latter, it'd erroneously suppose that the Z scan would take care of matters. To fix, force X.X = Y.Y to be generated and applied at that join node when this case occurs. This is *extremely* hard to hit in practice, because various planner behaviors conspire to mask the problem; starting with the fact that the planner doesn't really like to generate a parameterized plan of the above shape. (It might have been impossible to hit it before we tweaked things to allow this plan shape for star-schema cases.) Many thanks to Alexander Kirkouski for submitting a reproducible test case. The bug can be demonstrated in all branches back to 9.2 where parameterized paths were introduced, so back-patch that far.
* Fix comment whitespace in VS2105 patchAndrew Dunstan2016-04-29
| | | | per gripe from Michael Paquier.
* Fix typo in VS2015 patchAndrew Dunstan2016-04-29
| | | | reported by Christian Ullrich
* Support building with Visual Studio 2015Andrew Dunstan2016-04-29
| | | | | | | | | | | Adjust the way we detect the locale. As a result the minumum Windows version supported by VS2015 and later is Windows Vista. Add some tweaks to remove new compiler warnings. Remove documentation references to the now obsolete msysGit. Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich. Backpatch to 9.5
* Remember asking for feedback during walsender shutdown.Andres Freund2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | Since 5a991ef8 we're explicitly asking for feedback from the receiving side when shutting down walsender, if there's not yet replicated data. Unfortunately we didn't remember (i.e. set waiting_for_ping_response to true) having asked for feedback, leading to scenarios in which replies were requested at a high frequency. I can't reproduce this problem on my laptop, I think that's because the problem requires a significant TCP window to manifest due to the !pq_is_send_pending() condition. But since this clearly is a bug, let's fix it. There's quite possibly more wrong than just this though. While fiddling with WalSndDone(), I rewrote a hard to understand comment about looking at the flush vs. the write position. Reported-By: Nick Cleaton, Magnus Hagander Author: Nick Cleaton Discussion: CAFgz3kus=rC_avEgBV=+hRK5HYJ8vXskJRh8yEAbahJGTzF2VQ@mail.gmail.com CABUevExsjROqDcD0A2rnJ6HK6FuKGyewJr3PL12pw85BHFGS2Q@mail.gmail.com Backpatch: 9.4, were 5a991ef8 introduced the use of feedback messages during shutdown.
* Adjust DatumGetBool macro, this time for sure.Tom Lane2016-04-28
| | | | | | | | | | | | | | | Commit 23a41573c attempted to fix the DatumGetBool macro to ignore bits in a Datum that are to the left of the actual bool value. But it did that by casting the Datum to bool; and on compilers that use C99 semantics for bool, that ends up being a whole-word test, not a 1-byte test. This seems to be the true explanation for contrib/seg failing in VS2015. To fix, use GET_1_BYTE() explicitly. I think in the previous patch, I'd had some idea of not having to commit to bool being exactly 1 byte wide, but regardless of what the compiler's bool is, boolean columns and Datums are certainly 1 byte wide. The previous fix was (eventually) back-patched into all active versions, so do likewise with this one.
* Impose a full barrier in generic-xlc.h atomics functions.Noah Misch2016-04-26
| | | | | | | | | | | | pg_atomic_compare_exchange_*_impl() were providing only the semantics of an acquire barrier. Buildfarm members hornet and mandrill revealed this deficit beginning with commit 008608b9d51061b1f598c197477b3dc7be9c4a64. While we have no report of symptoms in 9.5, we can't rule out the possibility of certain compilers, hardware, or extension code relying on these functions' specified barrier semantics. Back-patch to 9.5, where commit b64d92f1a5602c55ee8b27a7ac474f03b7aee340 introduced atomics. Reviewed by Andres Freund.
* Rename strtoi() to strtoint().Tom Lane2016-04-23
| | | | | | | | | | | | NetBSD has seen fit to invent a libc function named strtoi(), which conflicts with the long-established static functions of the same name in datetime.c and ecpg's interval.c. While muttering darkly about intrusions on application namespace, we'll rename our functions to avoid the conflict. Back-patch to all supported branches, since this would affect attempts to build any of them on recent NetBSD. Thomas Munro
* Add putenv support for msvcrt from Visual Studio 2013Magnus Hagander2016-04-22
| | | | | | This was missed when VS 2013 support was added. Michael Paquier
* Fix unexpected side-effects of operator_precedence_warning.Tom Lane2016-04-21
| | | | | | | | | | | | | | | The implementation of that feature involves injecting nodes into the raw parsetree where explicit parentheses appear. Various places in parse_expr.c that test to see "is this child node of type Foo" need to look through such nodes, else we'll get different behavior when operator_precedence_warning is on than when it is off. Note that we only need to handle this when testing untransformed child nodes, since the AEXPR_PAREN nodes will be gone anyway after transformExprRecurse. Per report from Scott Ribe and additional code-reading. Back-patch to 9.5 where this feature was added. Report: <ED37E303-1B0A-4CD8-8E1E-B9C4C2DD9A17@elevated-dev.com>
* Fix planner failure with full join in RHS of left join.Tom Lane2016-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | Given a left join containing a full join in its righthand side, with the left join's joinclause referencing only one side of the full join (in a non-strict fashion, so that the full join doesn't get simplified), the planner could fail with "failed to build any N-way joins" or related errors. This happened because the full join was seen as overlapping the left join's RHS, and then recent changes within join_is_legal() caused that function to conclude that the full join couldn't validly be formed. Rather than try to rejigger join_is_legal() yet more to allow this, I think it's better to fix initsplan.c so that the required join order is explicit in the SpecialJoinInfo data structure. The previous coding there essentially ignored full joins, relying on the fact that we don't flatten them in the joinlist data structure to preserve their ordering. That's sufficient to prevent a wrong plan from being formed, but as this example shows, it's not sufficient to ensure that the right plan will be formed. We need to work a bit harder to ensure that the right plan looks sane according to the SpecialJoinInfos. Per bug #14105 from Vojtech Rylko. This was apparently induced by commit 8703059c6 (though now that I've seen it, I wonder whether there are related cases that could have failed before that); so back-patch to all active branches. Unfortunately, that patch also went into 9.0, so this bug is a regression that won't be fixed in that branch.
* Improve TranslateSocketError() to handle more Windows error codes.Tom Lane2016-04-21
| | | | | | The coverage was rather lean for cases that bind() or listen() might return. Add entries for everything that there's a direct equivalent for in the set of Unix errnos that elog.c has heard of.
* Remove dead code in win32.h.Tom Lane2016-04-21
| | | | | | | | | | | There's no longer a need for the MSVC-version-specific code stanza that forcibly redefines errno code symbols, because since commit 73838b52 we're unconditionally redefining them in the stanza before this one anyway. Now it's merely confusing and ugly, so get rid of it; and improve the comment that explains what's going on here. Although this is just cosmetic, back-patch anyway since I'm intending to back-patch some less-cosmetic changes in this same hunk of code.
* Provide errno-translation wrappers around bind() and listen() on Windows.Tom Lane2016-04-21
| | | | | | | | | Fix Windows builds to report something useful rather than "could not bind IPv4 socket: No error" when bind() fails. Back-patch of commits d1b7d4877b9a71f4 and 22989a8e34168f57. Discussion: <4065.1452450340@sss.pgh.pa.us>
* Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK.Tom Lane2016-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | When we shoehorned "x op ANY (array)" into the SQL syntax, we created a fundamental ambiguity as to the proper treatment of a sub-SELECT on the righthand side: perhaps what's meant is to compare x against each row of the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar sub-SELECT that delivers a single array value whose members should be compared against x. The grammar resolves it as the former case whenever the RHS is a select_with_parens, making the latter case hard to reach --- but you can get at it, with tricks such as attaching a no-op cast to the sub-SELECT. Parse analysis would throw away the no-op cast, leaving a parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr. ruleutils.c was not clued in on this fine point, and would naively emit "x op ANY ((SELECT ...))", which would be parsed as the first alternative, typically leading to errors like "operator does not exist: text = text[]" during dump/reload of a view or rule containing such a construct. To fix, emit a no-op cast when dumping such a parsetree. This might well be exactly what the user wrote to get the construct accepted in the first place; and even if she got there with some other dodge, it is a valid representation of the parsetree. Per report from Karl Czajkowski. He mentioned only a case involving RLS policies, but actually the problem is very old, so back-patch to all supported branches. Report: <20160421001832.GB7976@moraine.isi.edu>
* Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait.Tom Lane2016-04-20
| | | | | | | | | | | | | In commit 2ffa86962077c588 we made pg_ctl recognize an environment variable PGCTLTIMEOUT to set the default timeout for starting and stopping the postmaster. However, pg_regress uses pg_ctl only for the "stop" end of that; it has bespoke code for starting the postmaster, and that code has historically had a hard-wired 60-second timeout. Further buildfarm experience says it'd be a good idea if that timeout were also controlled by PGCTLTIMEOUT, so let's make it so. Like the previous patch, back-patch to all active branches. Discussion: <13969.1461191936@sss.pgh.pa.us>
* Fix memory leak and other bugs in ginPlaceToPage() & subroutines.Tom Lane2016-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 36a35c550ac114ca turned the interface between ginPlaceToPage and its subroutines in gindatapage.c and ginentrypage.c into a royal mess: page-update critical sections were started in one place and finished in another place not even in the same file, and the very same subroutine might return having started a critical section or not. Subsequent patches band-aided over some of the problems with this design by making things even messier. One user-visible resulting problem is memory leaks caused by the need for the subroutines to allocate storage that would survive until ginPlaceToPage calls XLogInsert (as reported by Julien Rouhaud). This would not typically be noticeable during retail index updates. It could be visible in a GIN index build, in the form of memory consumption swelling to several times the commanded maintenance_work_mem. Another rather nasty problem is that in the internal-page-splitting code path, we would clear the child page's GIN_INCOMPLETE_SPLIT flag well before entering the critical section that it's supposed to be cleared in; a failure in between would leave the index in a corrupt state. There were also assorted coding-rule violations with little immediate consequence but possible long-term hazards, such as beginning an XLogInsert sequence before entering a critical section, or calling elog(DEBUG) inside a critical section. To fix, redefine the API between ginPlaceToPage() and its subroutines by splitting the subroutines into two parts. The "beginPlaceToPage" subroutine does what can be done outside a critical section, including full computation of the result pages into temporary storage when we're going to split the target page. The "execPlaceToPage" subroutine is called within a critical section established by ginPlaceToPage(), and it handles the actual page update in the non-split code path. The critical section, as well as the XLOG insertion call sequence, are both now always started and finished in ginPlaceToPage(). Also, make ginPlaceToPage() create and work in a short-lived memory context to eliminate the leakage problem. (Since a short-lived memory context had been getting created in the most common code path in the subroutines, this shouldn't cause any noticeable performance penalty; we're just moving the overhead up one call level.) In passing, fix a bunch of comments that had gone unmaintained throughout all this klugery. Report: <571276DD.5050303@dalibo.com>
* Further reduce the number of semaphores used under --disable-spinlocks.Tom Lane2016-04-18
| | | | | | | | | | Per discussion, there doesn't seem to be much value in having NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are running more than a few backends concurrently, you really had better have a real spinlock implementation if you want tolerable performance. And 1024 semaphores is a sizable fraction of the system-wide SysV semaphore limit on many platforms. Therefore, reduce this setting's default value to 128 to make it less likely to cause out-of-semaphores problems.
* Fix possible crash in ALTER TABLE ... REPLICA IDENTITY USING INDEX.Tom Lane2016-04-15
| | | | | | | | | | | Careless coding added by commit 07cacba983ef79be could result in a crash or a bizarre error message if someone tried to select an index on the OID column as the replica identity index for a table. Back-patch to 9.4 where the feature was introduced. Discussion: CAKJS1f8TQYgTRDyF1_u9PVCKWRWz+DkieH=U7954HeHVPJKaKg@mail.gmail.com David Rowley
* Fix memory leak in GIN index scans.Tom Lane2016-04-15
| | | | | | | | | | | The code had a query-lifespan memory leak when encountering GIN entries that have posting lists (rather than posting trees, ie, there are a relatively small number of heap tuples containing this index key value). With a suitable data distribution this could add up to a lot of leakage. Problem seems to have been introduced by commit 36a35c550, so back-patch to 9.4. Julien Rouhaud
* Remove trailing commas in enums.Andres Freund2016-04-14
| | | | | These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These exist in differing places in most supported branches.
* Fix core dump in ReorderBufferRestoreChange on alignment-picky platforms.Tom Lane2016-04-14
| | | | | | | | | | | | | | | | | | | | | | | | When re-reading an update involving both an old tuple and a new tuple from disk, reorderbuffer.c was careless about whether the new tuple is suitably aligned for direct access --- in general, it isn't. We'd missed seeing this in the buildfarm because the contrib/test_decoding tests exercise this code path only a few times, and by chance all of those cases have old tuples with length a multiple of 4, which is usually enough to make the access to the new tuple's t_len safe. For some still-not-entirely-clear reason, however, Debian's sparc build gets a bus error, as reported by Christoph Berg; perhaps it's assuming 8-byte alignment of the pointer? The lack of previous field reports is probably because you need all of these conditions to trigger a crash: an alignment-picky platform (not Intel), a transaction large enough to spill to disk, an update within that xact that changes a primary-key field and has an odd-length old tuple, and of course logical decoding tracing the transaction. Avoid the alignment assumption by using memcpy instead of fetching t_len directly, and add a test case that exposes the crash on picky platforms. Back-patch to 9.4 where the bug was introduced. Discussion: <20160413094117.GC21485@msg.credativ.de>
* Adjust datatype of ReplicationState.acquired_by.Tom Lane2016-04-14
| | | | | | | | | | | | | | | | It was declared as "pid_t", which would be fine except that none of the places that printed it in error messages took any thought for the possibility that it's not equivalent to "int". This leads to warnings on some buildfarm members, and could possibly lead to actually wrong error messages on those platforms. There doesn't seem to be any very good reason not to just make it "int"; it's only ever assigned from MyProcPid, which is int. If we want to cope with PIDs that are wider than int, this is not the place to start. Also, fix the comment, which seems to perhaps be a leftover from a time when the field was only a bool? Per buildfarm. Back-patch to 9.5 which has same issue.
* Fix pg_dump so pg_upgrade'ing an extension with simple opfamilies works.Tom Lane2016-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported by Michael Feld, pg_upgrade'ing an installation having extensions with operator families that contain just a single operator class failed to reproduce the extension membership of those operator families. This caused no immediate ill effects, but would create problems when later trying to do a plain dump and restore, because the seemingly-not-part-of- the-extension operator families would appear separately in the pg_dump output, and then would conflict with the families created by loading the extension. This has been broken ever since extensions were introduced, and many of the standard contrib extensions are affected, so it's a bit astonishing nobody complained before. The cause of the problem is a perhaps-ill-considered decision to omit such operator families from pg_dump's output on the grounds that the CREATE OPERATOR CLASS commands could recreate them, and having explicit CREATE OPERATOR FAMILY commands would impede loading the dump script into pre-8.3 servers. Whatever the merits of that decision when 8.3 was being written, it looks like a poor tradeoff now. We can fix the pg_upgrade problem simply by removing that code, so that the operator families are dumped explicitly (and then will be properly made to be part of their extensions). Although this fixes the behavior of future pg_upgrade runs, it does nothing to clean up existing installations that may have improperly-linked operator families. Given the small number of complaints to date, maybe we don't need to worry about providing an automated solution for that; anyone who needs to clean it up can do so with manual "ALTER EXTENSION ADD OPERATOR FAMILY" commands, or even just ignore the duplicate-opfamily errors they get during a pg_restore. In any case we need this fix. Back-patch to all supported branches. Discussion: <20228.1460575691@sss.pgh.pa.us>