aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAge
* Allow ConditionVariable[PrepareTo]Sleep to auto-switch between CVs.Tom Lane2018-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | The original coding here insisted that callers manually cancel any prepared sleep for one condition variable before starting a sleep on another one. While that's not a huge burden today, it seems like a gotcha that will bite us in future if the use of condition variables increases; anything we can do to make the use of this API simpler and more robust is attractive. Hence, allow these functions to automatically switch their attention to a different CV when required. This is safe for the same reason it was OK for commit aced5a92b to let a broadcast operation cancel any prepared CV sleep: whenever we return to the other test-and-sleep loop, we will automatically re-prepare that CV, paying at most an extra test of that loop's exit condition. Back-patch to v10 where condition variables were introduced. Ordinarily we would probably not back-patch a change like this, but since it does not invalidate any coding pattern that was legal before, it seems safe enough. Furthermore, there's an open bug in replorigin_drop() for which the simplest fix requires this. Even if we chose to fix that in some more complicated way, the hazard would remain that we might back-patch some other bug fix that requires this behavior. Patch by me, reviewed by Thomas Munro. Discussion: https://postgr.es/m/2437.1515368316@sss.pgh.pa.us
* pg_upgrade: prevent check on live cluster from generating errorBruce Momjian2018-01-08
| | | | | | | | | | | Previously an inaccurate but harmless error was generated when running --check on a live server before reporting the servers as compatible. The fix is to split error reporting and exit control in the exec_prog() API. Reported-by: Daniel Westermann Backpatch-through: 10
* Reorder steps in ConditionVariablePrepareToSleep for more safety.Tom Lane2018-01-05
| | | | | | | | | | | | | | | In the admittedly-very-unlikely case that AddWaitEventToSet fails, ConditionVariablePrepareToSleep would error out after already having set cv_sleep_target, which is probably bad, and after having already set cv_wait_event_set, which is very bad. Transaction abort might or might not clean up cv_sleep_target properly; but there is nothing that would be aware that the WaitEventSet wasn't fully constructed, so that all future condition variable sleeps would be broken. We can easily guard against these hazards with slight restructuring. Back-patch to v10 where condition_variable.c was introduced. Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* Rewrite ConditionVariableBroadcast() to avoid live-lock.Tom Lane2018-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | The original implementation of ConditionVariableBroadcast was, per its self-description, "the dumbest way possible". Thomas Munro found out it was a bit too dumb. An awakened process may immediately re-queue itself, if the specific condition it's waiting for is not yet satisfied. If this happens before ConditionVariableBroadcast is able to see the wait queue as empty, then ConditionVariableBroadcast will re-awaken the same process, repeating the cycle. Given unlucky timing this back-and-forth can repeat indefinitely; loops lasting thousands of seconds have been seen in testing. To fix, add our own process to the end of the wait queue to serve as a sentinel, and exit the broadcast loop once our process is not there anymore. There are various special considerations described in the comments, the principal disadvantage being that wakers can no longer be sure whether they awakened a real waiter or just a sentinel. But in practice nobody pays attention to the result of ConditionVariableSignal or ConditionVariableBroadcast anyway, so that problem seems hypothetical. Back-patch to v10 where condition_variable.c was introduced. Tom Lane and Thomas Munro Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
* pg_upgrade: remove C commentBruce Momjian2018-01-05
| | | | | | Revert another part of 959ee6d267fb24e667fc64e9837a376e236e84a5 . Backpatch-through: 10
* pg_upgrade: revert part of patch for ease of translationBruce Momjian2018-01-05
| | | | | | Revert part of 959ee6d267fb24e667fc64e9837a376e236e84a5 . Backpatch-through: 10
* pg_upgrade: simplify code layout in a few placesBruce Momjian2018-01-05
| | | | Backpatch-through: 9.4 (9.3 didn't need improving)
* Fix failure to delete spill files of aborted transactionsAlvaro Herrera2018-01-05
| | | | | | | | | | | | | | | | | | Logical decoding's reorderbuffer.c may spill transaction files to disk when transactions are large. These are supposed to be removed when they become "too old" by xid; but file removal requires the boundary LSNs of the transaction to be known. The final_lsn is only set when we see the commit or abort record for the transaction, but nothing sets the value for transactions that crash, so the removal code misbehaves -- in assertion-enabled builds, it crashes by a failed assertion. To fix, modify the final_lsn of transactions that don't have a value set, to the LSN of the very latest change in the transaction. This causes the spilled files to be removed appropriately. Author: Atsushi Torikoshi Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp
* Fix new test case to not be endian-dependent.Tom Lane2018-01-04
| | | | | | Per buildfarm. Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com
* Fix incorrect computations of length of null bitmap in pageinspect.Tom Lane2018-01-04
| | | | | | | | | | | | | | | Instead of using our standard macro for this calculation, this code did it itself ... and got it wrong, leading to incorrect display of the null bitmap in some cases. Noted and fixed by Maksim Milyutin. In passing, remove a uselessly duplicative error check. Errors were introduced in commit d6061f83a; back-patch to 9.6 where that came in. Maksim Milyutin, reviewed by Andrey Borodin Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com
* Revert "Fix isolation test to be less timing-dependent"Alvaro Herrera2018-01-03
| | | | | | | | This reverts commit 2268e6afd596. It turned out that inconsistency in the report is still possible, so go back to the simpler formulation of the test and instead add an alternate expected output. Discussion: https://postgr.es/m/20180103193728.ysqpcp2xjnqpiep7@alvherre.pgsql
* Rename pg_rewind's copy_file_range() to avoid conflict with new linux syscall.Andres Freund2018-01-03
| | | | | | | | | | | | | | | | | Upcoming versions of glibc will contain copy_file_range(2), a wrapper around a new linux syscall for in-kernel copying of data ranges. This conflicts with pg_rewinds function of the same name. Therefore rename pg_rewinds version. As our version isn't a generic copying facility we decided to choose a rewind specific function name. Per buildfarm animal caiman and subsequent discussion with Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/20180103033425.w7jkljth3e26sduc@alap3.anarazel.de https://postgr.es/m/31122.1514951044@sss.pgh.pa.us Backpatch: 9.5-, where pg_rewind was introduced
* Fix use of config-specific libraries for Windows OpenSSLAndrew Dunstan2018-01-03
| | | | | | | | | | | Commit 614350a3 allowed for an different builds of OpenSSL libraries on Windows, but ignored the fact that the alternative builds don't have config-specific libraries. This patch fixes the Solution file to ask for the correct libraries. per offline discussions with Leonardo Cecchi and Marco Nenciarini, Backpatch to all live branches.
* Make XactLockTableWait work for transactions that are not yet self-lockedAlvaro Herrera2018-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas
* Fix isolation test to be less timing-dependentAlvaro Herrera2018-01-03
| | | | | | | | I did this by adding another locking process, which makes the other two wait. This way the output should be stable enough. Per buildfarm and Andres Freund Discussion: https://postgr.es/m/20180103034445.t3utrtrnrevfsghm@alap3.anarazel.de
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Fix deadlock hazard in CREATE INDEX CONCURRENTLYAlvaro Herrera2018-01-02
| | | | | | | | | | | | | | | | | | | | | | Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit c3d09b3bd23f specifically to support this case. In reality, one of the sessions would be aborted by a misterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by 8aa3e47510b9 keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com
* In tests, await an LSN no later than the recovery target.Noah Misch2017-12-31
| | | | | | | | | | Otherwise, the test fails with "Timed out while waiting for standby to catch up". This happened rarely, perhaps only when autovacuum wrote WAL between our choosing the recovery target and choosing the LSN to await. Commit b26f7fa6ae2b4e5d64525b3d5bc66a0ddccd9e24 fixed one case of this. Fix two more. Back-patch to 9.6, which introduced the affected test. Discussion: https://postgr.es/m/20180101055227.GA2952815@rfd.leadboat.com
* Properly set base backup backends to active in pg_stat_activityMagnus Hagander2017-12-29
| | | | | | | | | | | | | | | When walsenders were included in pg_stat_activity, only the ones actually streaming WAL were listed as active when they were active. In particular, the connections sending base backups were listed as being idle. Which means that a regular pg_basebackup would show up with one active and one idle connection, when both were active. This patch updates to set all walsenders to active when they are (including those doing very fast things like IDENTIFY_SYSTEM), and then back to idle. Details about exactly what they are doing is available in pg_stat_replication. Patch by me, review by Michael Paquier and David Steele.
* Update relation's stats in pg_class during vacuum full.Teodor Sigaev2017-12-27
| | | | | | | | | | | | | | Hash index depends on estimation of numbers of tuples and pages of relations, incorrect value could be a reason of significantly growing of index. Vacuum full recreates heap and reindex all indexes before renewal stats. The patch fixes that, so indexes will see correct values. Backpatch to v10 only because earlier versions haven't usable hash index and growing of hash index is a single user-visible symptom. Author: Amit Kapila Reviewed-by: Ashutosh Sharma, me Discussion: https://www.postgresql.org/message-id/flat/20171115232922.5tomkxnw3iq6jsg7@inml.weebeastie.net
* Fix UNION/INTERSECT/EXCEPT over no columns.Tom Lane2017-12-22
| | | | | | | | | | | | | | | Since 9.4, we've allowed the syntax "select union select" and variants of that. However, the planner wasn't expecting a no-column set operation and ended up treating the set operation as if it were UNION ALL. Turns out it's trivial to fix in v10 and later; we just need to be careful about not generating a Sort node with no sort keys. However, since a weird corner case like this is never going to be exercised by developers, we'd better have thorough regression tests if we want to consider it supported. Per report from Victor Yegorov. Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
* Cancel CV sleep during subtransaction abort.Robert Haas2017-12-21
| | | | | | | | | | | | | | Generally, error recovery paths that need to do things like LWLockReleaseAll and pgstat_report_wait_end also need to call ConditionVariableCancelSleep, but AbortSubTransaction was missed. Since subtransaction abort might destroy up the DSM segment that contains the ConditionVariable stored in cv_sleep_target, this can result in a crash for anything using condition variables. Reported and diagnosed by Andres Freund. Discussion: http://postgr.es/m/20171221110048.rxk6464azzl5t2fi@alap3.anarazel.de
* When passing query strings to workers, pass the terminating \0.Robert Haas2017-12-20
| | | | | | | | | Otherwise, when the query string is read, we might trailing garbage beyond the end, unless there happens to be a \0 there by good luck. Report and patch by Thomas Munro. Reviewed by Rafia Sabih. Discussion: http://postgr.es/m/CAEepm=2SJs7X+_vx8QoDu8d1SMEOxtLhxxLNzZun_BvNkuNhrw@mail.gmail.com
* Try again to fix accumulation of parallel worker instrumentation.Robert Haas2017-12-19
| | | | | | | | | | | | | | | | | When a Gather or Gather Merge node is started and stopped multiple times, accumulate instrumentation data only once, at the end, instead of after each execution, to avoid recording inflated totals. Commit 778e78ae9fa51e58f41cbdc72b293291d02d8984, the previous attempt at a fix, instead reset the state after every execution, which worked for the general instrumentation data but had problems for the additional instrumentation specific to Sort and Hash nodes. Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, following a design proposal from Thomas Munro, with a comment tweak by me. Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
* doc: Fix figures in example descriptionPeter Eisentraut2017-12-18
| | | | | | oversight in 244c8b466a743d1ec18a7d841bf42669699b3b56 Reported-by: Blaz Merela <blaz@merela.org>
* Fix bug in cancellation of non-exclusive backup to avoid assertion failure.Fujii Masao2017-12-19
| | | | | | | | | | | | | | | | | | | | | | | | Previously an assertion failure occurred when pg_stop_backup() for non-exclusive backup was aborted while it's waiting for WAL files to be archived. This assertion failure happened in do_pg_abort_backup() which was called when a non-exclusive backup was canceled. do_pg_abort_backup() assumes that there is at least one non-exclusive backup running when it's called. But pg_stop_backup() can be canceled even after it marks the end of non-exclusive backup (e.g., during waiting for WAL archiving). This broke the assumption that do_pg_abort_backup() relies on, and which caused an assertion failure. This commit changes do_pg_abort_backup() so that it does nothing when non-exclusive backup has been already marked as completed. That is, the asssumption is also changed, and do_pg_abort_backup() now can handle even the case where it's called when there is no running backup. Backpatch to 9.6 where SQL-callable non-exclusive backup was added. Author: Masahiko Sawada and Michael Paquier Reviewed-By: Robert Haas and Fujii Masao Discussion: https://www.postgresql.org/message-id/CAD21AoD2L1Fu2c==gnVASMyFAAaq3y-AQ2uEVj-zTCGFFjvmDg@mail.gmail.com
* Fix crashes on plans with multiple Gather (Merge) nodes.Robert Haas2017-12-18
| | | | | | | | | | | | | | | | es_query_dsa turns out to be broken by design, because it supposes that there is only one DSA for the whole query, whereas there is actually one per Gather (Merge) node. For now, work around that problem by setting and clearing the pointer around the sections of code that might need it. It's probably a better idea to get rid of es_query_dsa altogether in favor of having each node keep track individually of which DSA is relevant, but that seems like more than we would want to back-patch. Thomas Munro, reviewed and tested by Andreas Seltenreich, Amit Kapila, and by me. Discussion: http://postgr.es/m/CAEepm=1U6as=brnVvMNixEV2tpi8NuyQoTmO8Qef0-VV+=7MDA@mail.gmail.com
* Avoid and detect SIGPIPE race in TAP tests.Noah Misch2017-12-16
| | | | | | | | | | | Don't write to stdin of a psql process that could have already exited with an authentication failure. Buildfarm members crake and mandrill have failed once by doing so. Ignore SIGPIPE in all TAP tests. Back-patch to v10, where these tests were introduced. Reviewed by Michael Paquier. Discussion: https://postgr.es/m/20171209210203.GC3362632@rfd.leadboat.com
* Perform a lot more sanity checks when freezing tuples.Andres Freund2017-12-14
| | | | | | | | | | | | | | | | The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existance of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superflous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
* Fix pruning of locked and updated tuples.Andres Freund2017-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead two to breakages: First, failing index lookups, which in turn can e.g lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There's other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
* Fix walsender timeouts when decoding a large transactionAndrew Dunstan2017-12-14
| | | | | | | | | | | | | | | | | | | | | | | The logical slots have a fast code path for sending data so as not to impose too high a per message overhead. The fast path skips checks for interrupts and timeouts. However, the existing coding failed to consider the fact that a transaction with a large number of changes may take a very long time to be processed and sent to the client. This causes the walsender to ignore interrupts for potentially a long time and more importantly it will result in the walsender being killed due to timeout at the end of such a transaction. This commit changes the fast path to also check for interrupts and only allows calling the fast path when the last keepalive check happened less than half the walsender timeout ago. Otherwise the slower code path will be taken. Backpatched to 9.4 Petr Jelinek, reviewed by Kyotaro HORIGUCHI, Yura Sokolov, Craig Ringer and Robert Haas. Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
* Fix parallel index scan hang with deleted or half-dead pages.Robert Haas2017-12-13
| | | | | | | | | | The previous coding forgot to release the scan before seizing it again, leading to a lockup. Report by Patrick Hemmer. Diagnosis by Thomas Munro. Patch by Amit Kapila. Discussion: http://postgr.es/m/CAEepm=2xZUcOGP9V0O_G0=2P2wwXwPrkF=upWTCJSisUxMnuSg@mail.gmail.com
* Revert "Fix accumulation of parallel worker instrumentation."Robert Haas2017-12-13
| | | | | | | This reverts commit 778e78ae9fa51e58f41cbdc72b293291d02d8984. Per further discussion, that doesn't seem to be the best possible fix. Discussion: http://postgr.es/m/CAA4eK1LW2aFKzY3=vwvc=t-juzPPVWP2uT1bpx_MeyEqnM+p8g@mail.gmail.com
* Fix commentPeter Eisentraut2017-12-11
| | | | Reported-by: Noah Misch <noah@leadboat.com>
* Fix corner-case coredump in _SPI_error_callback().Tom Lane2017-12-11
| | | | | | | | | | | I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL, and only fills it in some time later. If an error were to happen in between, _SPI_error_callback would try to dereference the null pointer. This is unlikely --- there's not much between those points except push-snapshot calls --- but it's clearly not impossible. Tweak the callback to do nothing if the pointer isn't set yet. It's been like this for awhile, so back-patch to all supported branches.
* Fix typoMagnus Hagander2017-12-09
| | | | Reported by Robins Tharakan
* MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries.Noah Misch2017-12-09
| | | | | | | | | | | | | | | | | | Notably, this permits linking to the 32-bit Perl binaries advertised on perl.org, namely Strawberry Perl and ActivePerl. This has a side effect of permitting linking to binaries built with obsolete MSVC versions. By default, MSVC 2012 and later require a "safe exception handler table" in each binary. MinGW-built, 32-bit DLLs lack the relevant exception handler metadata, so linking to them failed with error LNK2026. Restore the semantics of MSVC 2010, which omits the table from a given binary if some linker input lacks metadata. This has no effect on 64-bit builds or on MSVC 2010 and earlier. Back-patch to 9.3 (all supported versions). Reported by Victor Wagner. Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
* MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T.Noah Misch2017-12-08
| | | | | | | | | | | | | | | | | | | | | Commits 5a5c2feca3fd858e70ea348822595547e6fa6c15 and b5178c5d08ca59e30f9d9428fa6fdb2741794e65 introduced support for modern MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl distributions like Strawberry Perl and modern ActivePerl. Perl has no robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so test this. Back-patch to 9.3 (all supported versions). The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when $Config{gccversion} is nonempty. That banks on every gcc-built Perl using the same ABI. gcc could change its default ABI the way MSVC once did, and one could build Perl with gcc and the non-default ABI. The GNU make build system could benefit from a similar test, without which it does not support MSVC-built Perl. For now, just add a comment. Most users taking the special step of building Perl with MSVC probably build PostgreSQL with MSVC. Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
* Prohibit identity columns on typed tables and partitionsPeter Eisentraut2017-12-08
| | | | | | | | | | Those cases currently crash and supporting them is more work then originally thought, so we'll just prohibit these scenarios for now. Author: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Reported-by: Мансур Галиев <gomer94@yandex.ru> Bug: #14866
* Fix mistake in commentPeter Eisentraut2017-12-08
| | | | Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
* Accept SCRAM channel binding enabled clientsPeter Eisentraut2017-12-08
| | | | | | | | | | | | | Add support to the SCRAM exchange for clients that support channel binding, such as PostgreSQL version 11 and beyond. If such a client encounters a PostgreSQL 10 server that does not support channel binding, it will send a channel binding flag 'y', meaning the client supports channel binding but thinks the server does not. But PostgreSQL 10 erroneously did not accept that flag. This would cause connections to fail if a version 11 client connects to a version 10 server with SCRAM authentication over SSL. Author: Michael Paquier <michael.paquier@gmail.com>
* Apply identity sequence values on COPYPeter Eisentraut2017-12-08
| | | | | | | | | | | A COPY into a table should apply identity sequence values just like it does for ordinary defaults. This was previously forgotten, leading to null values being inserted, which in turn would fail because identity columns have not-null constraints. Author: Michael Paquier <michael.paquier@gmail.com> Reported-by: Steven Winfield <steven.winfield@cantabcapital.com> Bug: #14952
* Report failure to start a background worker.Robert Haas2017-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
* doc: Update memory requirements for FOPPeter Eisentraut2017-12-05
| | | | Reported-by: Dave Page <dpage@pgadmin.org>
* Fix accumulation of parallel worker instrumentation.Robert Haas2017-12-05
| | | | | | | | | | | | | When a Gather or Gather Merge node is started and stopped multiple times, the old code wouldn't reset the shared state between executions, potentially resulting in dramatically inflated instrumentation data for nodes beneath it. (The per-worker instrumentation ended up OK, I think, but the overall totals were inflated.) Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, reviewed and tweaked a bit by me. Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
* Mark assorted variables PGDLLIMPORT.Robert Haas2017-12-05
| | | | | | | | | This makes life easier for extension authors who wish to support Windows. Brian Cloutier, slightly amended by me. Discussion: http://postgr.es/m/CAJCy68fscdNhmzFPS4kyO00CADkvXvEa-28H-OtENk-pa2OTWw@mail.gmail.com
* doc: Turn on generate.consistent.ids parameterPeter Eisentraut2017-12-05
| | | | | This ensures that automatically generated HTML anchors don't change in every build.
* Clean up assorted messiness around AllocateDir() usage.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
* Support boolean columns in functional-dependency statistics.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | There's no good reason that the multicolumn stats stuff shouldn't work on booleans. But it looked only for "Var = pseudoconstant" clauses, and it will seldom find those for boolean Vars, since earlier phases of planning will fold "boolvar = true" or "boolvar = false" to just "boolvar" or "NOT boolvar" respectively. Improve dependencies_clauselist_selectivity() to recognize such clauses as equivalent to equality restrictions. This fixes a failure of the extended stats mechanism to apply in a case reported by Vitaliy Garnashevich. It's not a complete solution to his problem because the bitmap-scan costing code isn't consulting extended stats where it should, but that's surely an independent issue. In passing, improve some comments, get rid of a NumRelids() test that's redundant with the preceding bms_membership() test, and fix dependencies_clauselist_selectivity() so that estimatedclauses actually is a pure output argument as stated by its API contract. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/73a4936d-2814-dc08-ed0c-978f76f435b0@gmail.com
* Fix non-GNU makefiles for AIX make.Noah Misch2017-11-30
| | | | | | Invoking the Makefile without an explicit target was building every possible target instead of just the "all" target. Back-patch to 9.3 (all supported versions).