aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAge
...
* Relax some asserts in merge join costing codeDavid Rowley2020-10-20
| | | | | | | | | | | | | | | | | | | | | | | | In the planner, it was possible, given an extreme enough case containing a large number of joins for the number of estimated rows to become infinite. This could cause problems in initial_cost_mergejoin() where we perform some calculations based on those row estimates. A problem case, presented by Onder Kalaci showed an Assert failure from an Assert checking outerstartsel <= outerendsel. In his test case this was effectively NaN <= Inf, which is false. The NaN outerstartsel came from multiplying the infinite outer_path_rows by 0.0. In master, this problem was fixed by a90c950fc, however, that fix was too invasive for the backbranches. Here we just relax the Asserts to allow them to pass. The worst that appears to happen from this is that we show NaN cost values and infinite row estimates in EXPLAIN. add_path() would have had a hard time doing anything useful with such costs, but that does not really matter as if the row estimates were even close to accurate, such plan would not complete this side of the heat death of the universe. Reported-by: Onder Kalaci Backpatch: 9.5 to 13 Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com
* Fix potential memory leak in pgcryptoMichael Paquier2020-10-19
| | | | | | | | | | | | | | | | | | | | | When allocating a EVP context, it would have been possible to leak some memory allocated directly by OpenSSL, that PostgreSQL lost track of if the initialization of the context allocated failed. The cleanup can be done with EVP_MD_CTX_destroy(). Note that EVP APIs exist since OpenSSL 0.9.7 and we have in the tree equivalent implementations for older versions since ce9b75d (code removed with 9b7cd59a as of 10~). However, in 9.5 and 9.6, the existing code makes use of EVP_MD_CTX_destroy() and EVP_MD_CTX_create() without an equivalent implementation when building the tree with OpenSSL 0.9.6 or older, meaning that this code is in reality broken with such versions since it got introduced in e2838c5. As we have heard no complains about that, it does not seem worth bothering with in 9.5 and 9.6, so I have left that out for simplicity. Author: Michael Paquier Discussion: https://postgr.es/m/20201015072212.GC2305@paquier.xyz Backpatch-through: 9.5
* Doc: caution against misuse of 'now' and related datetime literals.Tom Lane2020-10-17
| | | | | | | | | | | | | | Section 8.5.1.4, which defines these literals, made only a vague reference to the fact that they might be evaluated too soon to be safe in non-interactive contexts. Provide a more explicit caution against misuse. Also, generalize the wording in the related tip in section 9.9.4: while it clearly described this problem, it implied (or really, stated outright) that the problem only applies to table DEFAULT clauses. Per gripe from Tijs van Dam. Back-patch to all supported branches. Discussion: https://postgr.es/m/c2LuRv9BiRT3bqIo5mMQiVraEXey_25B4vUn0kDqVqilwOEu_iVF1tbtvLnyQK7yDG3PFaz_GxLLPil2SDkj1MCObNRVaac-7j1dVdFERk8=@thalex.com
* Update time zone data files to tzdata release 2020c.Tom Lane2020-10-16
| | | | | | DST law changes in Morocco, Canadian Yukon, Fiji, Macquarie Island, Casey Station (Antarctica). Historical corrections for France, Hungary, Monaco.
* Sync our copy of the timezone library with IANA release tzcode2020c.Tom Lane2020-10-16
| | | | | | | | | | | | | | | This changes zic's default output format from "-b fat" to "-b slim". We were already using "slim" in v13/HEAD, so those branches drop the explicit -b switch in the Makefiles. Instead, add an explicit "-b fat" in v12 and before, so that we don't change the output file format in those branches. (This is perhaps excessively conservative, but we decided not to do so in a12079109, and I'll stick with that.) Other non-cosmetic changes are to drop support for zic's long-obsolete "-y" switch, and to ensure that strftime() does not change errno unless it fails. As usual with tzcode changes, back-patch to all supported branches.
* Add missing error check in pgcrypto/crypt-md5.c.Tom Lane2020-10-16
| | | | | | | | | | | | | | | In theory, the second px_find_digest call in px_crypt_md5 could fail even though the first one succeeded, since resource allocation is required. Don't skip testing for a failure. (If one did happen, the likely result would be a crash rather than clean recovery from an OOM failure.) The code's been like this all along, so back-patch to all supported branches. Daniel Gustafsson Discussion: https://postgr.es/m/AA8D6FE9-4AB2-41B4-98CB-AE64BA668C03@yesql.se
* Doc: tweak column widths in synchronous-commit-matrix table.Tom Lane2020-10-16
| | | | | | | | | | | | Commit a97e85f2b caused "exceed the available area" warnings in PDF builds. Fine-tune colwidth values to avoid that. Back-patch to 9.6, like the prior patch. (This is of dubious value before v13, since we were far from free of such warnings in older branches. But we might as well keep the SGML looking the same in all branches.) Per buildfarm.
* llvmjit: Work around bug in LLVM 3.9 causing crashes after 72559438f92.Andres Freund2020-10-15
| | | | | | | | | | | | | | Unfortunately in LLVM 3.9 LLVMGetAttributeCountAtIndex(func, index) crashes when called with an index that has 0 attributes. Since there's no way to work around this in the C API, add a small C++ wrapper doing so. The only reason this didn't fail before 72559438f92 is that there always are function attributes... Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20201016001254.w2nfj7gd74jmb5in@alap3.anarazel.de Backpatch: 11-, like 72559438f92
* pg_upgrade: remove C99 compiler req. from commit 3c0471b5fdBruce Momjian2020-10-15
| | | | | | | | | | | | | | This commit required support for inline variable definition, which is not a requirement. RELEASE NOTE AUTHOR: the author of commit 3c0471b5fd (pg_upgrade/tablespaces) was Justin Pryzby, not me. Reported-by: Andres Freund Discussion: https://postgr.es/m/20201016001959.h24fkywfubkv2pc5@alap3.anarazel.de Backpatch-through: 9.5
* pg_upgrade: generate check error for left-over new tablespaceBruce Momjian2020-10-15
| | | | | | | | | | | | | Previously, if pg_upgrade failed, and the user recreated the cluster but did not remove the new cluster tablespace directory, a later pg_upgrade would fail since the new tablespace directory would already exists. This adds error reporting for this during check. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20200925005531.GJ23631@telsasoft.com Backpatch-through: 9.5
* llvmjit: Also copy parameter / return value attributes from template functions.Andres Freund2020-10-15
| | | | | | | | | | | | | | | | | | | | | Previously we only copied the function attributes. That caused problems at least on s390x: Because we didn't copy the 'zeroext' attribute for ExecAggTransReparent()'s *IsNull parameters, expressions invoking it didn't ensure that the upper bytes of the registers were zeroed. In the - relatively rare - cases where not, ExecAggTransReparent() wrongly ended up in the newValueIsNull branch due to the register not being zero. Subsequently causing a crash. It's quite possible that this would cause problems on other platforms, and in other places than just ExecAggTransReparent() on s390x. Thanks to Christoph (and the Debian project) for providing me with access to a s390x machine, allowing me to debug this. Reported-By: Christoph Berg Author: Andres Freund Discussion: https://postgr.es/m/20201015083246.kie5726xerdt3ael@alap3.anarazel.de Backpatch: 11-, where JIT was added
* doc: improve description of synchronous_commit modesBruce Momjian2020-10-15
| | | | | | | | | | | | Previously it wasn't clear exactly what each of the synchronous_commit modes accomplished. This clarifies that, and adds a table describing it. Only backpatched through 9.6 since 9.5 doesn't have all the options. Reported-by: kghost0@gmail.com Discussion: https://postgr.es/m/159741195522.14321.13812604195366728976@wrigleys.postgresql.org Backpatch-through: 9.6
* In the postmaster, rely on the signal infrastructure to block signals.Tom Lane2020-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX sigaction(2) can be told to block a set of signals while a signal handler executes. Make use of that instead of manually blocking and unblocking signals in the postmaster's signal handlers. This should save a few cycles, but more importantly it prevents recursive invocation of signal handlers when many signals arrive in close succession. (Assuming that the platform's signal infrastructure is designed to avoid consuming stack space in that case, but this is demonstrably true at least on Linux.) The existing code has been seen to recurse to the point of stack overflow, either in the postmaster or in a forked-off child. Back-patch of commit 9abb2bfc0. At the time, we'd only seen excess postmaster stack consumption in the buildfarm; but we now have a user report of it, and that commit has aged enough to have a fair amount of confidence that it doesn't break anything. This still doesn't change anything about the way that it works on Windows. Perhaps someone else would like to fix that? Per bug #16673 from David Geier. Back-patch to 9.6. Although the problem exists in principle before that, we've only seen it actually materialize in connection with heavy use of parallel workers, so it doesn't seem necessary to do anything in 9.5; and the relevant code is different there, too. Discussion: https://postgr.es/m/16673-d278c604f8e34ec0@postgresql.org Discussion: https://postgr.es/m/14878.1570820201@sss.pgh.pa.us
* doc: Mention that toast_tuple_target affects also column marked as Main.Fujii Masao2020-10-15
| | | | | | | | | | | | Previously it was documented that toast_tuple_target affected column marked as only External or Extended. But this description is not correct and toast_tuple_target affects also column marked as Main. Back-patch to v11 where toast_tuple_target reloption was introduced. Author: Shinya Okano Reviewed-by: Tatsuhito Kasahara, Fujii Masao Discussion: https://postgr.es/m/93f46e311a67422e89e770d236059817@oss.nttdata.com
* Paper over regression failures in infinite_recurse() on PPC64 Linux.Tom Lane2020-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our infinite_recurse() test to verify sane stack-overrun behavior is affected by a bug of the Linux kernel on PPC64: it will get SIGSEGV if it receives a signal when the stack depth is (a) over 1MB and (b) within a few kB of filling the current physical stack allocation. See https://bugzilla.kernel.org/show_bug.cgi?id=205183. Since this test is a bit time-consuming and we run it in parallel with test scripts that do a lot of DDL, it can be expected to get an sinval catchup interrupt at some point, leading to failure if the timing is wrong. This has caused more than 100 buildfarm failures over the past year or so. While a fix exists for the kernel bug, it might be years before that propagates into all production kernels, particularly in some of the older distros we have in the buildfarm. For now, let's just back off and not run this test on Linux PPC64; that loses nothing in test coverage so far as our own code is concerned. To do that, split this test into a new script infinite_recurse.sql and skip the test when the platform name is powerpc64...-linux-gnu. Back-patch to v12. Branches before that have not been seen to get this failure. No doubt that's because the "errors" test was not run in parallel with other tests before commit 798070ec0, greatly reducing the odds of an sinval catchup being necessary. I also back-patched 3c8553547 into v12, just so the new regression script would look the same in all branches having it. Discussion: https://postgr.es/m/3479046.1602607848@sss.pgh.pa.us Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com
* Fix GiST buffering build to work when there are included columns.Tom Lane2020-10-12
| | | | | | | | | | | | | | gistRelocateBuildBuffersOnSplit did not get the memo about which attribute count to use. This could lead to a crash if there were included columns and buffering build was chosen. (Because there are random page-split decisions elsewhere in GiST index build, the crashes are not entirely deterministic.) Back-patch to v12 where GiST gained support for included columns. Pavel Borisov Discussion: https://postgr.es/m/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay+r6HQ@mail.gmail.com
* Fix memory leak when guc.c decides a setting can't be applied now.Tom Lane2020-10-12
| | | | | | | | | | | | | | | | | | | | The prohibitValueChange code paths in set_config_option(), which are executed whenever we re-read a PGC_POSTMASTER variable from postgresql.conf, neglected to free anything before exiting. Thus we'd leak the proposed new value of a PGC_STRING variable, as noted by BoChen in bug #16666. For all variable types, if the check hook creates an "extra" chunk, we'd also leak that. These are malloc not palloc chunks, so there is no mechanism for recovering the leaks before process exit. Fortunately, the values are typically not very large, meaning you'd have to go through an awful lot of SIGHUP configuration-reload cycles to make the leakage amount to anything. Still, for a long-lived postmaster process it could potentially be a problem. Oversight in commit 2594cf0e8. Back-patch to all supported branches. Discussion: https://postgr.es/m/16666-2c41a4eec61b03e1@postgresql.org
* Fix optimization hazard in gram.y's makeOrderedSetArgs(), redux.Tom Lane2020-10-07
| | | | | | | | | | | | | | | | It appears that commit cf63c641c, which intended to prevent misoptimization of the result-building step in makeOrderedSetArgs, didn't go far enough: buildfarm member hornet's version of xlc is now optimizing back to the old, broken behavior in which list_length(directargs) is fetched only after list_concat() has changed that value. I'm not entirely convinced whether that's an undeniable compiler bug or whether it can be justified by a sufficiently aggressive interpretation of C sequence points. So let's just change the code to make it harder to misinterpret. Back-patch to all supported versions, just in case. Discussion: https://postgr.es/m/1830491.1601944935@sss.pgh.pa.us
* Rethink recent fix for pg_dump's handling of extension config tables.Tom Lane2020-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3eb3d3e78 was a few bricks shy of a load: while it correctly set the table's "interesting" flag when deciding to dump the data of an extension config table, it was not correct to clear that flag if we concluded we shouldn't dump the data. This led to the crash reported in bug #16655, because in fact we'll traverse dumpTableSchema anyway for all extension tables (to see if they have user-added seclabels or RLS policies). The right thing to do is to force "interesting" true in makeTableDataInfo, and otherwise leave the flag alone. (Doing it there is more future-proof in case additional calls are added, and it also avoids setting the flag unnecessarily if that function decides the table is non-dumpable.) This investigation also showed that while only the --inserts code path had an obvious failure in the case considered by 3eb3d3e78, the COPY code path also has a problem with not having loaded table subsidiary data. That causes fmtCopyColumnList to silently return an empty string instead of the correct column list. That accidentally mostly works, which perhaps is why we didn't notice this before. It would only fail if the restore column order is different from the dump column order, which only happens in weird inheritance cases, so it's not surprising nobody had hit the case with an extension config table. Nonetheless, it's a bug, and it goes a long way back, not just to v12 where the --inserts code path started to have a problem with this. In hopes of catching such cases a bit sooner in future, add some Asserts that "interesting" has been set in both dumpTableData and dumpTableSchema. Adjust the test case added by 3eb3d3e78 so that it checks the COPY rather than INSERT form of that bug, allowing it to detect the longer-standing symptom. Per bug #16655 from Cameron Daniel. Back-patch to all supported branches. Discussion: https://postgr.es/m/16655-5c92d6b3a9438137@postgresql.org Discussion: https://postgr.es/m/18048b44-3414-b983-8c7c-9165b177900d@2ndQuadrant.com
* pg_upgrade: remove pre-8.4 code and >= 8.4 checkBruce Momjian2020-10-06
| | | | | | | | | | We only support upgrading from >= 8.4 so no need for this code or tests. Reported-by: Magnus Hagander Discussion: https://postgr.es/m/CABUevEx-D0PNVe00tkeQRGennZQwDtBJn=493MJt-x6sppbUxA@mail.gmail.com Backpatch-through: 9.5
* pg_upgrade; change major version comparisons to use <=, not <Bruce Momjian2020-10-06
| | | | | | This makes checking for older major versions more consistent. Backpatch-through: 9.5
* Build EC members for child join rels in the right memory context.Tom Lane2020-10-06
| | | | | | | | | | | | | | | | | This patch prevents crashes or wrong plans when partition-wise joins are considered during GEQO planning, as a consequence of the EquivalenceClass data structures becoming corrupt after a GEQO context reset. A remaining problem is that successive GEQO cycles will make multiple copies of the required EC members, since add_child_join_rel_equivalences has no idea that such members might exist already. For now we'll just live with that. The lack of field complaints of crashes suggests that this is a mighty little-used situation. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/1683100.1601860653@sss.pgh.pa.us
* doc: show functions returning record types and use of ROWS FROMBruce Momjian2020-10-05
| | | | | | | | | | | | | Previously it was unclear exactly how ROWS FROM behaved and how to cast the data types of columns returned by FROM functions. Also document that only non-OUT record functions can have their columns cast to data types. Reported-by: guyren@gmail.com Discussion: https://postgr.es/m/158638264419.662.2482095087061084020@wrigleys.postgresql.org Backpatch-through: 9.5
* docs: clarify the interaction of clientcert and cert auth.Bruce Momjian2020-10-05
| | | | | | This is the first paragraph change of master-only commit 253f1025da. Backpatch-through: PG 12-13 only
* Fix two latent(?) bugs in equivclass.c.Tom Lane2020-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | get_eclass_for_sort_expr() computes expr_relids and nullable_relids early on, even though they won't be needed unless we make a new EquivalenceClass, which we often don't. Aside from the probably-minor inefficiency, there's a memory management problem: these bitmapsets will be built in the caller's context, leading to dangling pointers if that is shorter-lived than root->planner_cxt. This would be a live bug if get_eclass_for_sort_expr() could be called with create_it = true during GEQO join planning. So far as I can find, the core code never does that, but it's hard to be sure that no extensions do, especially since the comments make it clear that that's supposed to be a supported case. Fix by not computing these values until we've switched into planner_cxt to build the new EquivalenceClass. generate_join_implied_equalities() uses inner_rel->relids to look up relevant eclasses, but it ought to be using nominal_inner_relids. This is presently harmless because a child RelOptInfo will always have exactly the same eclass_indexes as its topmost parent; but that might not be true forever, and anyway it makes the code confusing. The first of these is old (introduced by me in f3b3b8d5b), so back-patch to all supported branches. The second only dates to v13, but we might as well back-patch it to keep the code looking similar across branches. Discussion: https://postgr.es/m/1508010.1601832581@sss.pgh.pa.us
* Improve stability of identity.sql regression test.Tom Lane2020-10-04
| | | | | | | | | | | | | | I noticed while trying to run the regression tests under a low geqo_threshold that one query on information_schema.columns had unstable (as in, variable from one run to the next) output order. This is pretty unsurprising given the complexity of the underlying plan. Interestingly, of this test's three nigh-identical queries on information_schema.columns, the other two already had ORDER BY clauses guaranteeing stable output. Let's make this one look the same. Back-patch to v10 where this test was added. We've not heard field reports of the test failing, but this experience shows that it can happen when testing under even slightly unusual conditions.
* doc: libpq connection options can override command-line flagsBruce Momjian2020-10-02
| | | | | | | | Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/16486-b9c93d71c02c4907@postgresql.org Backpatch-through: 9.5
* doc: clarify the use of ssh port forwardingBruce Momjian2020-10-02
| | | | | | | | Reported-by: karimelghazouly@gmail.com Discussion: https://postgr.es/m/159854511172.24991.4373145230066586863@wrigleys.postgresql.org Backpatch-through: 9.5
* Put back explicit setting of replication values within TAP tests.Tom Lane2020-10-01
| | | | | | | | | | | | | | Commit 151c0c5f7 neglected the possibility that a TEMP_CONFIG file would explicitly set max_wal_senders=0; as indeed buildfarm member thorntail does, so that it can test wal_level=minimal in other test suites. Hence, rather than assuming that max_wal_senders=10 will prevail if we say nothing, set it explicitly. Set max_replication_slots=10 explicitly too, just to be safe. Back-patch to v10, like the previous patch. Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Fix incorrect assertion on number of array dimensions.Heikki Linnakangas2020-10-01
| | | | | | | | | This has been wrong ever since the support for multi-dimensional arrays as PL/python function arguments and return values was introduced in commit 94aceed317. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/61647b8e-961c-0362-d5d3-c8a18f4a7ec6%40iki.fi
* Reword partitioning error messageAlvaro Herrera2020-09-30
| | | | | | | | | | The error message about columns in the primary key not including all of the partition key was unclear; reword it. Backpatch all the way to pg11, where it appeared. Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com> Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
* Fix handling of BC years in to_date/to_timestamp.Tom Lane2020-09-30
| | | | | | | | | | | | | | | | | | | | | | | Previously, a conversion such as to_date('-44-02-01','YYYY-MM-DD') would result in '0045-02-01 BC', as the code attempted to interpret the negative year as BC, but failed to apply the correction needed for our internal handling of BC years. Fix the off-by-one problem. Also, arrange for the combination of a negative year and an explicit "BC" marker to cancel out and produce AD. This is how the negative-century case works, so it seems sane to do likewise. Continue to read "year 0000" as 1 BC. Oracle would throw an error, but we've accepted that case for a long time so I'm hesitant to change it in a back-patch. Per bug #16419 from Saeed Hubaishan. Back-patch to all supported branches. Dar Alathar-Yemen and Tom Lane Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
* Doc: Improve clarity on partitioned table limitationsDavid Rowley2020-09-30
| | | | | | | | | | | Explicitly mention that primary key constraints are also included in the limitation that the constraint columns must be a superset of the partition key columns. Wording suggestion from Tom Lane. Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com Backpatch-through: 11, where unique constraints on partitioned tables were added
* Remove obsolete replication settings within TAP tests.Tom Lane2020-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PostgresNode.pm set "max_wal_senders = 5" for replication testing, but this seems to be slightly too low for our current test suite. Slower buildfarm members frequently report "number of requested standby connections exceeds max_wal_senders" failures, due to old walsenders not exiting instantaneously. Usually, the test does not fail overall because of automatic walreceiver restart, but sometimes the failure becomes visible; and in any case such retries slow down the test. That value came in with commit 89ac7004d, but was soon obsoleted by f6d6d2920, which raised the built-in default from zero to 10; so that PostgresNode.pm is actually setting it to less than the conservative built-in default. That seems pretty pointless, so let's remove the special setting and let the default prevail, in hopes of making the TAP tests more robust. Likewise, the setting "max_replication_slots = 5" is obsolete and can be removed. While here, reverse-engineer a comment about why we're choosing less-than-default values for some other settings. (Note: before v12, max_wal_senders counted against max_connections so that the latter setting also needs some fiddling with.) Back-patch to v10 where the subscription tests were added. It's likely that the older branches aren't pushing the boundaries of max_wal_senders, but I'm disinclined to spend time trying to figure out exactly when it started to be a problem. Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Fix memory leak in plpgsql's CALL processing.Tom Lane2020-09-29
| | | | | | | | | | | | | | | | | | | | | When executing a CALL or DO in a non-atomic context (i.e., not inside a function or query), plpgsql creates a new plan each time through, as a rather hacky solution to some resource management issues. But it failed to free this plan until exit of the current procedure or DO block, resulting in serious memory bloat in procedures that called other procedures many times. Fix by remembering to free the plan, and by being more honest about restoring the previous state (otherwise, recursive procedure calls have a problem). There was also a smaller leak associated with recalculation of the "target" list of output variables. Fix that by using the statement- lifespan context to hold non-permanent values. Back-patch to v11 where procedures were introduced. Pavel Stehule and Tom Lane Discussion: https://postgr.es/m/CAFj8pRDiiU1dqym+_P4_GuTWm76knJu7z9opWayBJTC0nQGUUA@mail.gmail.com
* Archive timeline history files in standby if archive_mode is set to "always".Fujii Masao2020-09-29
| | | | | | | | | | | | | | | | | | | | Previously the standby server didn't archive timeline history files streamed from the primary even when archive_mode is set to "always", while it archives the streamed WAL files. This could cause the PITR to fail because there was no required timeline history file in the archive. The cause of this issue was that walreceiver didn't mark those files as ready for archiving. This commit makes walreceiver mark those streamed timeline history files as ready for archiving if archive_mode=always. Then the archiver process archives the marked timeline history files. Back-patch to all supported versions. Reported-by: Grigory Smolkin Author: Grigory Smolkin, Fujii Masao Reviewed-by: David Zhang, Anastasia Lubennikova Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru
* Fix progress reporting of REINDEX CONCURRENTLYMichael Paquier2020-09-29
| | | | | | | | | | | | | | | | This addresses a couple of issues with the so-said subject: - Report the correct parent relation with the index actually being rebuilt or validated. Previously, the command status remained set to the last index created for the progress of the index build and validation, which would be incorrect when working on a table that has more than one index. - Use the correct phase when waiting before the drop of the old indexes. Previously, this was reported with the same status as when waiting before the old indexes are marked as dead. Author: Matthias van de Meent, Michael Paquier Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com Backpatch-through: 12
* Assign collations in partition bound expressions.Tom Lane2020-09-28
| | | | | | | | | | Failure to do this can result in errors during evaluation of the bound expression, as illustrated by the new regression test. Back-patch to v12 where the ability for partition bounds to be expressions was added. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Revise RelationBuildRowSecurity() to avoid memory leaks.Tom Lane2020-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This function leaked some memory while loading qual clauses for an RLS policy. While ordinarily negligible, that could build up in some repeated-reload cases, as reported by Konstantin Knizhnik. We can improve matters by borrowing the coding long used in RelationBuildRuleLock: build stringToNode's result directly in the target context, and remember to explicitly pfree the input string. This patch by no means completely guarantees zero leaks within this function, since we have no real guarantee that the catalog- reading subroutines it calls don't leak anything. However, practical tests suggest that this is enough to resolve the issue. In any case, any remaining leaks are similar to those risked by RelationBuildRuleLock and other relcache-loading subroutines. If we need to fix them, we should adopt a more global approach such as that used by the RECOVER_RELATION_BUILD_MEMORY hack. While here, let's remove the need for an expensive PG_TRY block by using MemoryContextSetParent to reparent an initially-short-lived context for the RLS data. Back-patch to all supported branches. Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru
* Fix handling of -d "connection string" in pg_dump/pg_restore.Tom Lane2020-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parallel pg_dump failed if its -d parameter was a connection string containing any essential information other than host, port, or username. The same was true for pg_restore with --create. The reason is that these scenarios failed to preserve the connection string from the command line; the code felt free to replace that with just the database name when reconnecting from a pg_dump parallel worker or after creating the target database. By chance, parallel pg_restore did not suffer this defect, as long as you didn't say --create. In practice it seems that the error would be obvious only if the connstring included essential, non-default SSL or GSS parameters. This may explain why it took us so long to notice. (It also makes it very difficult to craft a regression test case illustrating the problem, since the test would fail in builds without those options.) Fix by refactoring so that ConnectDatabase always receives all the relevant options directly from the command line, rather than reconstructed values. Inject a different database name, when necessary, by relying on libpq's rules for handling multiple "dbname" parameters. While here, let's get rid of the essentially duplicate _connectDB function, as well as some obsolete nearby cruft. Per bug #16604 from Zsolt Ero. Back-patch to all supported branches. Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
* Fix missing fsync of SLRU directories.Thomas Munro2020-09-24
| | | | | | | | | | | | | Harmonize behavior by moving reponsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit 1b02be21), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
* Avoid possible dangling-pointer access in tsearch_readline_callback.Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
* Fix whitespacePeter Eisentraut2020-09-20
|
* Use factorial rather than numeric_fac in create_operator.sql.Tom Lane2020-09-18
| | | | | | | | | | | | These two SQL functions are aliases for the same C function, so this change has no semantic effect. However, because we dropped the numeric_fac alias in HEAD (commit 76f412ab3), operator definitions based on that one don't port forward, causing problems for cross-version upgrade tests based on the regression database. Patch all active back branches to dodge the problem. Discussion: https://postgr.es/m/449144.1600439950@sss.pgh.pa.us
* Update parallel BTree scan state when the scan keys can't be satisfied.Amit Kapila2020-09-17
| | | | | | | | | | | | | | For parallel btree scan to work for array of scan keys, it should reach BTPARALLEL_DONE state once for every distinct combination of array keys. This is required to ensure that the parallel workers don't try to seize blocks at the same time for different scan keys. We missed to update this state when we discovered that the scan keys can't be satisfied. Author: James Hunter Reviewed-by: Amit Kapila Tested-by: Justin Pryzby Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
* Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a partitioned table's column is already marked NOT NULL, there is no need to examine its partitions, because we can rely on previous DDL to have enforced that the child columns are NOT NULL as well. (Unfortunately, the same cannot be said for traditional inheritance, so for now we have to restrict the optimization to partitioned tables.) Hence, we may skip recursing to child tables in this situation. The reason this case is worth worrying about is that when pg_dump dumps a partitioned table having a primary key, it will include the requisite NOT NULL markings in the CREATE TABLE commands, and then add the primary key as a separate step. The primary key addition generates a SET NOT NULL as a subcommand, just to be sure. So the situation where a SET NOT NULL is redundant does arise in the real world. Skipping the recursion does more than just save a few cycles: it means that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY KEY" will take locks only on the partition parent table, not on the partitions. It turns out that parallel pg_restore is effectively assuming that that's true, and has little choice but to do so because the dependencies listed for such a TOC entry don't include the partitions. pg_restore could thus issue this ALTER while data restores on the partitions are still in progress. Taking unnecessary locks on the partitions not only hurts concurrency, but can lead to actual deadlock failures, as reported by Domagoj Smoljanovic. (A contributing factor in the deadlock is that TRUNCATE on a child partition wants a non-exclusive lock on the parent. This seems likewise unnecessary, but the fix for it is more invasive so we won't consider back-patching it. Fortunately, getting rid of one of these two poor behaviors is enough to remove the deadlock.) Although support for partitioned primary keys came in with v11, this patch is dependent on the SET NOT NULL refactoring done by commit f4a3fdfbd, so we can only patch back to v12. Patch by me; thanks to Alvaro Herrera and Amit Langote for review. Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
* Fix bogus cache-invalidation logic in logical replication worker.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | The code recorded cache invalidation events by zeroing the "localreloid" field of affected cache entries. However, it's possible for an inval event to occur even while we have the entry open and locked. So an ill-timed inval could result in "cache lookup failed for relation 0" errors, if the worker's code tried to use the cleared field. We can fix that by creating a separate bool field to record whether the entry needs to be revalidated. (In the back branches, cram the bool into what had been padding space, to avoid an ABI break in the somewhat unlikely event that any extension is looking at this struct.) Also, rearrange the logic in logicalrep_rel_open so that it does the right thing in cases where table_open would fail. We should retry the lookup by name in that case, but we didn't. The real-world impact of this is probably small. In the first place, the error conditions are very low probability, and in the second place, the worker would just exit and get restarted. We only noticed because in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly, preventing the worker from making progress. Nonetheless, it's clearly a bug, and it impedes a useful type of testing; so back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
* Fix interpolation in test name.Noah Misch2020-09-13
| | | | | | | A pre-commit review had reported the problem, but the fix reached only v10 and earlier. Back-patch to v11. Discussion: https://postgr.es/m/20200423.140546.1055476118690602079.horikyota.ntt@gmail.com
* Use the properly transformed RangeVar for expandTableLikeClause().Tom Lane2020-09-13
| | | | | | | | | | | | | | | transformCreateStmt() adjusts the transformed statement's RangeVar to specify the target schema explicitly, for the express reason of making sure that auxiliary statements derived by parse transformation operate on the right table. But the refactoring I did in commit 502898192 got this wrong and passed the untransformed RangeVar to expandTableLikeClause(). This could lead to assertion failures or weird misbehavior if the wrong table was accessed. Per report from Alexander Lakhin. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/05051f9d-b32b-cb35-6735-0e9f2ab86b5f@gmail.com
* doc: Don't hide the "Up" link when it is the same as "Home"Peter Eisentraut2020-09-12
| | | | | | | | The original stylesheets seemed to think this was a good idea, but our users find it confusing and unhelpful, so undo that logic. Reported-by: Fabien COELHO <coelho@cri.ensmp.fr> Discussion: https://www.postgresql.org/message-id/flat/alpine.DEB.2.22.394.2006210914370.859381%40pseudo