path: root/src
Commit message (author, date)
* Update time zone data files to tzdata release 2020d. (Tom Lane, 2020-10-22)
  DST law changes in Palestine, with a whopping 120 hours' notice. Also some historical corrections for Palestine.
* Sync our copy of the timezone library with IANA release tzcode2020d. (Tom Lane, 2020-10-22)
  There's no functional change at all here, but I'm curious to see whether this change successfully shuts up Coverity's warning about a useless strcmp(), which appeared with the previous update.
  Discussion: http://mm.icann.org/pipermail/tz/2020-October/029370.html
* Fix connection string handling in psql's \connect command. (Tom Lane, 2020-10-21)
  psql's \connect claims to be able to re-use previous connection parameters, but in fact it only re-uses the database name, user name, host name (and possibly hostaddr, depending on version), and port. This is problematic for assorted use cases. Notably, pg_dump[all] emits "\connect databasename" commands which we would like to have re-use all other parameters. If such a script is loaded in a psql run that initially had "-d connstring" with some non-default parameters, those other parameters would be lost, potentially causing connection failure. (Thus, this is the same kind of bug addressed in commits a45bc8a4f and 8e5793ab6, although the details are much different.)
  To fix, redesign do_connect() so that it pulls out all properties of the old PGconn using PQconninfo(), and then replaces individual properties in that array. In the case where we don't wish to re-use anything, get libpq's default settings using PQconndefaults() and replace entries in that, so that we don't need different code paths for the two cases.
  This does result in an additional behavioral change for cases where the original connection parameters allowed multiple hosts, say "psql -h host1,host2", and the \connect request allows re-use of the host setting. Because the previous coding relied on PQhost(), it would only permit reconnection to the same host originally selected. Although one can think of scenarios where that's a good thing, there are others where it is not. Moreover, that behavior doesn't seem to meet the principle of least surprise, nor was it documented; nor is it even clear it was intended, since that coding long pre-dates the addition of multi-host support to libpq. Hence, this patch is content to drop it and re-use the host list as given.
  Per Peter Eisentraut's comments on bug #16604. Back-patch to all supported branches.
  Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
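  The "start from one complete parameter set, then replace individual entries" design described in this commit can be sketched roughly as follows. This is only an illustrative model in Python using plain dictionaries; the function name `build_connect_params` and the sample parameter values are hypothetical, and the real code works on the arrays returned by PQconninfo() and PQconndefaults().

```python
from typing import Dict, Optional


def build_connect_params(
    previous: Dict[str, str],
    defaults: Dict[str, str],
    overrides: Dict[str, Optional[str]],
    reuse_previous: bool,
) -> Dict[str, str]:
    """Pick one complete base set, then replace individual entries in it."""
    # Base set: everything the old connection used, or the library defaults.
    # Either way the code path below is the same.
    base = dict(previous if reuse_previous else defaults)
    for key, value in overrides.items():
        if value is not None:  # None means "not specified in this request"
            base[key] = value
    return base


# Hypothetical sample data, standing in for PQconninfo()/PQconndefaults() output.
previous = {"host": "host1,host2", "port": "5432",
            "dbname": "postgres", "sslmode": "verify-full"}
defaults = {"host": "localhost", "port": "5432",
            "dbname": "postgres", "sslmode": "prefer"}

reused = build_connect_params(previous, defaults, {"dbname": "sales"}, True)
fresh = build_connect_params(previous, defaults, {"dbname": "sales"}, False)
print(reused)
print(fresh)
```

  In this model the re-used connection keeps the non-default `sslmode` and the full host list, while only the database name changes.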
* Use fast checkpoint in PostgresNode::backup() (Alvaro Herrera, 2020-10-21)
  Should cause tests to be a bit faster
* Fix ALTER TABLE .. ENABLE/DISABLE TRIGGER recursion (Alvaro Herrera, 2020-10-20)
  More precisely, correctly handle the ONLY flag indicating not to recurse. This was implemented in 86f575948c77 by recursing in trigger.c, but that's the wrong place; use ATSimpleRecursion instead, which behaves properly. However, because legacy inheritance has never recursed in that situation, make sure to do that only for new-style partitioning.
  I noticed this problem while testing a fix for another bug in the vicinity.
  This has been wrong all along, so backpatch to 11.
  Discussion: https://postgr.es/m/20201016235925.GA29829@alvherre.pgsql
* Avoid invalid alloc size error in shm_mq (Peter Eisentraut, 2020-10-20)
  In shm_mq_receive(), a huge payload could trigger an unjustified "invalid memory alloc request size" error due to the way the buffer size is increased. Add error checks (documenting the upper limit) and avoid the error by limiting the allocation size to MaxAllocSize.
  Author: Markus Wanner <markus.wanner@2ndquadrant.com>
  Discussion: https://www.postgresql.org/message-id/flat/3bb363e7-ac04-0ac4-9fe8-db1148755bfa%402ndquadrant.com
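  The kind of failure described here arises when a buffer is grown by doubling: the doubled size can overshoot the allocator's limit even though the payload itself fits. A minimal Python sketch of a clamped growth rule, assuming a doubling strategy (the function name `next_buffer_size` is hypothetical and this is not the actual shm_mq code):

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF  # value of PostgreSQL's MaxAllocSize (1 GB - 1)


def next_buffer_size(current: int, needed: int) -> int:
    """Grow a buffer size by doubling, but never past MAX_ALLOC_SIZE."""
    # Explicit error check: a payload above the limit can never be received.
    if needed > MAX_ALLOC_SIZE:
        raise ValueError("payload exceeds the maximum allocation size")
    new_size = max(current, 1)
    while new_size < needed:
        new_size *= 2
    # Doubling may overshoot the limit even though `needed` fits; clamp it.
    return min(new_size, MAX_ALLOC_SIZE)


print(next_buffer_size(8192, 10000))
print(hex(next_buffer_size(8192, 0x30000000)))
```

  Without the final clamp, the second call would ask for 0x40000000 bytes, one step past the limit, for a payload that is itself well within it.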
* Fix connection string handling in src/bin/scripts/ programs. (Tom Lane, 2020-10-19)
  When told to process all databases, clusterdb, reindexdb, and vacuumdb would reconnect by replacing their --maintenance-db parameter with the name of the target database. If that parameter is a connstring (which has been allowed for a long time, though we failed to document that before this patch), we'd lose any other options it might specify, for example SSL or GSS parameters, possibly resulting in failure to connect. Thus, this is the same bug as commit a45bc8a4f fixed in pg_dump and pg_restore. We can fix it in the same way, by using libpq's rules for handling multiple "dbname" parameters to add the target database name separately.
  I chose to apply the same refactoring approach as in that patch, with a struct to handle the command line parameters that need to be passed through to connectDatabase. (Maybe someday we can unify the very similar functions here and in pg_dump/pg_restore.)
  Per Peter Eisentraut's comments on bug #16604. Back-patch to all supported branches.
  Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
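  The libpq rule relied on here is, as documented for PQconnectdbParams() with expand_dbname enabled, that only the first "dbname" keyword may be expanded as a connection string, and that a later non-empty value for a repeated keyword wins. A simplified Python model of that rule follows; `resolve_params` and `parse_connstring` are hypothetical helpers, and the parser handles only unquoted `key=value` pairs rather than libpq's full syntax.

```python
from typing import Dict, List, Optional


def parse_connstring(text: str) -> Dict[str, str]:
    """Toy parser: whitespace-separated key=value pairs, no quoting or URIs."""
    return dict(item.split("=", 1) for item in text.split())


def resolve_params(keywords: List[str],
                   values: List[Optional[str]]) -> Dict[str, str]:
    """Model of 'first dbname may expand, later settings override earlier'."""
    result: Dict[str, str] = {}
    seen_dbname = False
    for key, value in zip(keywords, values):
        if not value:  # None or empty values are ignored
            continue
        if key == "dbname" and not seen_dbname:
            seen_dbname = True
            if "=" in value:  # looks like a connection string: expand it
                result.update(parse_connstring(value))
                continue
        result[key] = value  # plain value; overrides any earlier setting
    return result


params = resolve_params(
    ["dbname", "dbname"],
    ["host=db.example sslmode=require dbname=postgres", "sales"],
)
print(params)
```

  Passing the maintenance connstring first and the target database name second keeps the SSL setting while switching databases, which is the effect the commit is after.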
* Fix list-munging bug that broke SQL function result coercions. (Tom Lane, 2020-10-19)
  Since commit 913bbd88d, check_sql_fn_retval() can either insert type coercion steps in-line in the Query that produces the SQL function's results, or generate a new top-level Query to perform the coercions, if modifying the Query's output in-place wouldn't be safe. However, it appears that the latter case has never actually worked, because the code tried to inject the new Query back into the query list it was passed ... which is not the list that will be used for later processing when we execute the SQL function "normally" (without inlining it). So we ended up with no coercion happening at run-time, leading to wrong results or crashes depending on the datatypes involved.
  While the regression tests look like they cover this area well enough, through a huge bit of bad luck all the test cases that exercise the separate-Query path were checking either inline-able cases (which accidentally didn't have the bug) or cases that are no-ops at runtime (e.g., varchar to text), so that the failure to perform the coercion wasn't obvious. The fact that the cases that don't work weren't allowed at all before v13 probably contributed to not noticing the problem sooner, too.
  To fix, get rid of the separate "flat" list of Query nodes and instead pass the real two-level list that is going to be used later. I chose to make the same change in check_sql_fn_statements(), although that has no actual bug, just so that we don't need that data structure at all.
  This is an API change, as evidenced by the adjustments needed to callers outside functions.c. That's a bit scary to be doing in a released branch, but so far as I can tell from a quick search, there are no outside callers of these functions (and they are sufficiently specific to our semantics for SQL-language functions that it's not apparent why any extension would need to call them). In any case, v13 already changed the API of check_sql_fn_retval() compared to prior branches.
  Per report from pinker. Back-patch to v13 where this code came in.
  Discussion: https://postgr.es/m/1603050466566-0.post@n3.nabble.com
* In libpq for Windows, call WSAStartup once and WSACleanup not at all. (Tom Lane, 2020-10-19)
  The Windows documentation insists that every WSAStartup call should have a matching WSACleanup call. However, if that ever had actual relevance, it wasn't in this century. Every remotely-modern Windows kernel is capable of cleaning up when a process exits without doing that, and must be so to avoid resource leaks in case of a process crash. Moreover, Postgres backends have done WSAStartup without WSACleanup since commit 4cdf51e64 in 2004, and we've never seen any indication of a problem with that.
  libpq's habit of doing WSAStartup during connection start and WSACleanup during shutdown is also rather inefficient, since a series of non-overlapping connection requests leads to repeated, quite expensive DLL unload/reload cycles. We document a workaround for that (having the application call WSAStartup for itself), but that's just a kluge. It's also worth noting that it's far from uncommon for applications to exit without doing PQfinish, and we've not heard reports of trouble from that either.
  However, the real reason for acting on this is that recent experiments by Alexander Lakhin show that calling WSACleanup during PQfinish is triggering the symptom we occasionally see that a process using libpq fails to emit expected stdio output.
  Therefore, let's change libpq so that it calls WSAStartup only once per process, during the first connection attempt, and never calls WSACleanup at all. While at it, get rid of the only other WSACleanup call in our code tree, in pg_dump/parallel.c; that presumably is equally useless.
  Back-patch of HEAD commit 7d00a6b2d.
  Discussion: https://postgr.es/m/ac976d8c-03df-d6b8-025c-15a2de8d9af1@postgrespro.ru
* Relax some asserts in merge join costing code (David Rowley, 2020-10-20)
  In the planner, it was possible, given an extreme enough case containing a large number of joins, for the number of estimated rows to become infinite. This could cause problems in initial_cost_mergejoin() where we perform some calculations based on those row estimates.
  A problem case, presented by Onder Kalaci, showed an Assert failure from an Assert checking outerstartsel <= outerendsel. In his test case this was effectively NaN <= Inf, which is false. The NaN outerstartsel came from multiplying the infinite outer_path_rows by 0.0.
  In master, this problem was fixed by a90c950fc; however, that fix was too invasive for the backbranches. Here we just relax the Asserts to allow them to pass. The worst that appears to happen from this is that we show NaN cost values and infinite row estimates in EXPLAIN. add_path() would have had a hard time doing anything useful with such costs, but that does not really matter as if the row estimates were even close to accurate, such a plan would not complete this side of the heat death of the universe.
  Reported-by: Onder Kalaci
  Backpatch: 9.5 to 13
  Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com
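  The floating-point behavior behind this Assert failure is easy to reproduce. The short Python sketch below uses the variable names from the commit message; the `relaxed_check` expression is only one hypothetical way to tolerate NaN and is not claimed to be the form the commit uses.

```python
import math

outer_path_rows = math.inf             # row estimate that overflowed to infinity
outerstartsel = outer_path_rows * 0.0  # inf * 0.0 is NaN under IEEE 754
outerendsel = math.inf

# Any ordered comparison involving NaN is false, so this check fails.
strict_check = outerstartsel <= outerendsel

# Hypothetical relaxed form: let NaN through instead of failing.
relaxed_check = math.isnan(outerstartsel) or outerstartsel <= outerendsel

print(outerstartsel, strict_check, relaxed_check)
```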
* Update time zone data files to tzdata release 2020c. (Tom Lane, 2020-10-16)
  DST law changes in Morocco, Canadian Yukon, Fiji, Macquarie Island, Casey Station (Antarctica). Historical corrections for France, Hungary, Monaco.
* Sync our copy of the timezone library with IANA release tzcode2020c. (Tom Lane, 2020-10-16)
  This changes zic's default output format from "-b fat" to "-b slim". We were already using "slim" in v13/HEAD, so those branches drop the explicit -b switch in the Makefiles. Instead, add an explicit "-b fat" in v12 and before, so that we don't change the output file format in those branches. (This is perhaps excessively conservative, but we decided not to do so in a12079109, and I'll stick with that.)
  Other non-cosmetic changes are to drop support for zic's long-obsolete "-y" switch, and to ensure that strftime() does not change errno unless it fails.
  As usual with tzcode changes, back-patch to all supported branches.
* llvmjit: Work around bug in LLVM 3.9 causing crashes after 72559438f92. (Andres Freund, 2020-10-15)
  Unfortunately in LLVM 3.9 LLVMGetAttributeCountAtIndex(func, index) crashes when called with an index that has 0 attributes. Since there's no way to work around this in the C API, add a small C++ wrapper doing so.
  The only reason this didn't fail before 72559438f92 is that there always are function attributes...
  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/20201016001254.w2nfj7gd74jmb5in@alap3.anarazel.de
  Backpatch: 11-, like 72559438f92
* pg_upgrade: remove C99 compiler req. from commit 3c0471b5fd (Bruce Momjian, 2020-10-15)
  This commit required support for inline variable definition, which is not a requirement.
  RELEASE NOTE AUTHOR: the author of commit 3c0471b5fd (pg_upgrade/tablespaces) was Justin Pryzby, not me.
  Reported-by: Andres Freund
  Discussion: https://postgr.es/m/20201016001959.h24fkywfubkv2pc5@alap3.anarazel.de
  Backpatch-through: 9.5
* pg_upgrade: generate check error for left-over new tablespace (Bruce Momjian, 2020-10-15)
  Previously, if pg_upgrade failed, and the user recreated the cluster but did not remove the new cluster tablespace directory, a later pg_upgrade would fail since the new tablespace directory would already exist. This adds error reporting for this during check.
  Reported-by: Justin Pryzby
  Discussion: https://postgr.es/m/20200925005531.GJ23631@telsasoft.com
  Backpatch-through: 9.5
* llvmjit: Also copy parameter / return value attributes from template functions. (Andres Freund, 2020-10-15)
  Previously we only copied the function attributes. That caused problems at least on s390x: Because we didn't copy the 'zeroext' attribute for ExecAggTransReparent()'s *IsNull parameters, expressions invoking it didn't ensure that the upper bytes of the registers were zeroed. In the relatively rare cases where they were not zeroed, ExecAggTransReparent() wrongly ended up in the newValueIsNull branch due to the register not being zero, subsequently causing a crash.
  It's quite possible that this would cause problems on other platforms, and in other places than just ExecAggTransReparent() on s390x.
  Thanks to Christoph (and the Debian project) for providing me with access to a s390x machine, allowing me to debug this.
  Reported-By: Christoph Berg
  Author: Andres Freund
  Discussion: https://postgr.es/m/20201015083246.kie5726xerdt3ael@alap3.anarazel.de
  Backpatch: 11-, where JIT was added
* doc: improve description of synchronous_commit modes (Bruce Momjian, 2020-10-15)
  Previously it wasn't clear exactly what each of the synchronous_commit modes accomplished. This clarifies that, and adds a table describing it. Only backpatched through 9.6 since 9.5 doesn't have all the options.
  Reported-by: kghost0@gmail.com
  Discussion: https://postgr.es/m/159741195522.14321.13812604195366728976@wrigleys.postgresql.org
  Backpatch-through: 9.6
* Fix query in new test to check tables are synced (Alvaro Herrera, 2020-10-15)
  Rather than looking for tablesync workers, it is more reliable to see the sync state of the tables.
  Per note from Amit Kapila.
  Discussion: https://postgr.es/m/CAA4eK1JSSD7FVwq+_rOme86jUZTQFzjsNU06hQ4-LiRt1xFmSg@mail.gmail.com
* Handle EACCES errors from kevent() better. (Thomas Munro, 2020-10-15)
  While registering for postmaster exit events, we have to handle a couple of edge cases where the postmaster is already gone. Commit 815c2f09 missed one: EACCES must surely imply that PostmasterPid no longer belongs to our postmaster process (or alternatively an unexpected permissions model has been imposed on us). Like ESRCH, this should be treated as a WL_POSTMASTER_DEATH event, rather than being raised with ereport().
  No known problems reported in the wild. Per code review from Tom Lane. Back-patch to 13.
  Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/3624029.1602701929%40sss.pgh.pa.us
* Restore replication protocol's duplicate command tags (Alvaro Herrera, 2020-10-14)
  I removed the duplicate command tags for START_REPLICATION inadvertently in commit 07082b08cc5d, but the replication protocol requires them. The fact that the replication protocol was broken was not noticed because all our test cases use an optimized code path that exits early, failing to verify that the behavior is correct for non-optimized cases. Put them back.
  Also document this protocol quirk.
  Add a test case that shows the failure. It might still succeed even without the patch when run on a fast enough server, but it suffices to show the bug in enough cases that it would be noticed in buildfarm.
  Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
  Reported-by: Henry Hinze <henry.hinze@gmail.com>
  Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com>
  Discussion: https://postgr.es/m/16643-eaadeb2a1a58d28c@postgresql.org
* Make WL_POSTMASTER_DEATH level-triggered on kqueue builds. (Thomas Munro, 2020-10-15)
  If WaitEventSetWait() reports that the postmaster has gone away, later calls to WaitEventSetWait() should continue to report that. Otherwise further waits that occur in the proc_exit() path after we already noticed the postmaster's demise could block forever.
  Back-patch to 13, where the kqueue support landed.
  Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/3624029.1602701929%40sss.pgh.pa.us
* Paper over regression failures in infinite_recurse() on PPC64 Linux. (Tom Lane, 2020-10-13)
  Our infinite_recurse() test to verify sane stack-overrun behavior is affected by a bug of the Linux kernel on PPC64: it will get SIGSEGV if it receives a signal when the stack depth is (a) over 1MB and (b) within a few kB of filling the current physical stack allocation. See https://bugzilla.kernel.org/show_bug.cgi?id=205183. Since this test is a bit time-consuming and we run it in parallel with test scripts that do a lot of DDL, it can be expected to get an sinval catchup interrupt at some point, leading to failure if the timing is wrong. This has caused more than 100 buildfarm failures over the past year or so.
  While a fix exists for the kernel bug, it might be years before that propagates into all production kernels, particularly in some of the older distros we have in the buildfarm. For now, let's just back off and not run this test on Linux PPC64; that loses nothing in test coverage so far as our own code is concerned. To do that, split this test into a new script infinite_recurse.sql and skip the test when the platform name is powerpc64...-linux-gnu.
  Back-patch to v12. Branches before that have not been seen to get this failure. No doubt that's because the "errors" test was not run in parallel with other tests before commit 798070ec0, greatly reducing the odds of an sinval catchup being necessary. I also back-patched 3c8553547 into v12, just so the new regression script would look the same in all branches having it.
  Discussion: https://postgr.es/m/3479046.1602607848@sss.pgh.pa.us
  Discussion: https://postgr.es/m/20190723162703.GM22387%40telsasoft.com
* Fix GiST buffering build to work when there are included columns. (Tom Lane, 2020-10-12)
  gistRelocateBuildBuffersOnSplit did not get the memo about which attribute count to use. This could lead to a crash if there were included columns and buffering build was chosen. (Because there are random page-split decisions elsewhere in GiST index build, the crashes are not entirely deterministic.)
  Back-patch to v12 where GiST gained support for included columns.
  Pavel Borisov
  Discussion: https://postgr.es/m/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay+r6HQ@mail.gmail.com
* Fix memory leak when guc.c decides a setting can't be applied now. (Tom Lane, 2020-10-12)
  The prohibitValueChange code paths in set_config_option(), which are executed whenever we re-read a PGC_POSTMASTER variable from postgresql.conf, neglected to free anything before exiting. Thus we'd leak the proposed new value of a PGC_STRING variable, as noted by BoChen in bug #16666. For all variable types, if the check hook creates an "extra" chunk, we'd also leak that.
  These are malloc not palloc chunks, so there is no mechanism for recovering the leaks before process exit. Fortunately, the values are typically not very large, meaning you'd have to go through an awful lot of SIGHUP configuration-reload cycles to make the leakage amount to anything. Still, for a long-lived postmaster process it could potentially be a problem.
  Oversight in commit 2594cf0e8. Back-patch to all supported branches.
  Discussion: https://postgr.es/m/16666-2c41a4eec61b03e1@postgresql.org
* Choose ppc compare_exchange constant path for more operand values. (Noah Misch, 2020-10-11)
  The implementation uses smaller code when the "expected" operand is a small constant, but the implementation needlessly defined the set of acceptable constants more narrowly than the ABI does. Core PostgreSQL and PGXN don't use the constant path at all, so this is future-proofing. Back-patch to v13, where commit 30ee5d17c20dbb282a9952b3048d6ad52d56c371 introduced this code.
  Reviewed by Tom Lane. Reported by Christoph Berg.
  Discussion: https://postgr.es/m/20201009092825.GD889580@msg.df7cb.de
* For ppc gcc, implement 64-bit compare_exchange and fetch_add with asm. (Noah Misch, 2020-10-11)
  While xlc defines __64BIT__, gcc does not. Due to this oversight in commit 30ee5d17c20dbb282a9952b3048d6ad52d56c371, gcc builds continued implementing 64-bit atomics by way of intrinsics. Back-patch to v13, where that commit first appeared.
  Reviewed by Tom Lane.
  Discussion: https://postgr.es/m/20201011051043.GA1724101@rfd.leadboat.com
* Fix optimization hazard in gram.y's makeOrderedSetArgs(), redux. (Tom Lane, 2020-10-07)
  It appears that commit cf63c641c, which intended to prevent misoptimization of the result-building step in makeOrderedSetArgs, didn't go far enough: buildfarm member hornet's version of xlc is now optimizing back to the old, broken behavior in which list_length(directargs) is fetched only after list_concat() has changed that value. I'm not entirely convinced whether that's an undeniable compiler bug or whether it can be justified by a sufficiently aggressive interpretation of C sequence points. So let's just change the code to make it harder to misinterpret.
  Back-patch to all supported versions, just in case.
  Discussion: https://postgr.es/m/1830491.1601944935@sss.pgh.pa.us
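  The ordering at stake, reading a list's length before an in-place concatenation changes it, can be illustrated with a small Python sketch. Python defines evaluation order strictly, so the compiler hazard itself cannot occur here; the sketch only shows the "capture first, in its own statement" style, and the function below is a hypothetical stand-in, not the gram.y code.

```python
from typing import List, Tuple


def make_ordered_set_args(directargs: List[str],
                          orderedargs: List[str]) -> Tuple[List[str], int]:
    """Return the combined argument list plus the count of direct arguments."""
    # Capture the length first, as a separate statement, so that no reordering
    # can make it observe the list after it has been extended.
    ndirectargs = len(directargs)
    # In-place extension plays the role of list_concat(): it changes the
    # length of directargs.
    directargs.extend(orderedargs)
    return directargs, ndirectargs


args, ndirect = make_ordered_set_args(["a", "b"], ["c"])
print(args, ndirect)
```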
* Prevent internal overflows in date-vs-timestamp and related comparisons. (Tom Lane, 2020-10-07)
  The date-vs-timestamp, date-vs-timestamptz, and timestamp-vs-timestamptz comparators all worked by promoting the first type to the second and then doing a simple same-type comparison. This works fine, except when the conversion result is out of range, in which case we throw an entirely avoidable error. The sources of such failures are (a) type date can represent dates much farther in the future than the timestamp types can; (b) timezone rotation might cause a just-in-range timestamp value to become a just-out-of-range timestamptz value.
  Up to now we just ignored these corner-case issues, but now we have an actual user complaint (bug #16657 from Huss EL-Sheikh), so let's do something about it.
  It turns out that commit 52ad1e659 already built all the necessary infrastructure to support error-free comparisons, but neglected to actually use it in the main-line code paths. Fix that, do a little bit of code style review, and remove the now-duplicate logic in jsonpath_exec.c.
  Back-patch to v13 where 52ad1e659 came in. We could take this back further by back-patching said infrastructure, but given the small number of complaints so far, I don't feel a great need to.
  Discussion: https://postgr.es/m/16657-cde2f876d8cc7971@postgresql.org
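  The idea of an error-free cross-type comparison for case (a) can be sketched as follows: if the date lies beyond anything the timestamp type can represent, the answer is known without converting. This Python model uses a toy range limit (`TS_END_DAY`) and hypothetical names; it is not the infrastructure from commit 52ad1e659.

```python
USECS_PER_DAY = 86_400_000_000
TS_END_DAY = 1000  # toy limit: first day number a timestamp cannot represent


def compare_date_timestamp(date_days: int, ts_usecs: int) -> int:
    """Return -1, 0 or 1 without ever converting an out-of-range date.

    ts_usecs is assumed to be a valid, in-range timestamp value.
    """
    if date_days >= TS_END_DAY:
        # The date is later than every representable timestamp, so the
        # result is known and no conversion (or error) is needed.
        return 1
    promoted = date_days * USECS_PER_DAY  # safe: the date is in range
    return (promoted > ts_usecs) - (promoted < ts_usecs)


latest_ts = (TS_END_DAY - 1) * USECS_PER_DAY
print(compare_date_timestamp(TS_END_DAY + 5, latest_ts))
print(compare_date_timestamp(10, 10 * USECS_PER_DAY))
```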
* Rethink recent fix for pg_dump's handling of extension config tables. (Tom Lane, 2020-10-07)
  Commit 3eb3d3e78 was a few bricks shy of a load: while it correctly set the table's "interesting" flag when deciding to dump the data of an extension config table, it was not correct to clear that flag if we concluded we shouldn't dump the data. This led to the crash reported in bug #16655, because in fact we'll traverse dumpTableSchema anyway for all extension tables (to see if they have user-added seclabels or RLS policies). The right thing to do is to force "interesting" true in makeTableDataInfo, and otherwise leave the flag alone. (Doing it there is more future-proof in case additional calls are added, and it also avoids setting the flag unnecessarily if that function decides the table is non-dumpable.)
  This investigation also showed that while only the --inserts code path had an obvious failure in the case considered by 3eb3d3e78, the COPY code path also has a problem with not having loaded table subsidiary data. That causes fmtCopyColumnList to silently return an empty string instead of the correct column list. That accidentally mostly works, which perhaps is why we didn't notice this before. It would only fail if the restore column order is different from the dump column order, which only happens in weird inheritance cases, so it's not surprising nobody had hit the case with an extension config table. Nonetheless, it's a bug, and it goes a long way back, not just to v12 where the --inserts code path started to have a problem with this.
  In hopes of catching such cases a bit sooner in future, add some Asserts that "interesting" has been set in both dumpTableData and dumpTableSchema. Adjust the test case added by 3eb3d3e78 so that it checks the COPY rather than INSERT form of that bug, allowing it to detect the longer-standing symptom.
  Per bug #16655 from Cameron Daniel. Back-patch to all supported branches.
  Discussion: https://postgr.es/m/16655-5c92d6b3a9438137@postgresql.org
  Discussion: https://postgr.es/m/18048b44-3414-b983-8c7c-9165b177900d@2ndQuadrant.com
* pg_upgrade: remove pre-8.4 code and >= 8.4 check (Bruce Momjian, 2020-10-06)
  We only support upgrading from >= 8.4 so no need for this code or tests.
  Reported-by: Magnus Hagander
  Discussion: https://postgr.es/m/CABUevEx-D0PNVe00tkeQRGennZQwDtBJn=493MJt-x6sppbUxA@mail.gmail.com
  Backpatch-through: 9.5
* pg_upgrade; change major version comparisons to use <=, not < (Bruce Momjian, 2020-10-06)
  This makes checking for older major versions more consistent.
  Backpatch-through: 9.5
* Build EC members for child join rels in the right memory context. (Tom Lane, 2020-10-06)
  This patch prevents crashes or wrong plans when partition-wise joins are considered during GEQO planning, as a consequence of the EquivalenceClass data structures becoming corrupt after a GEQO context reset.
  A remaining problem is that successive GEQO cycles will make multiple copies of the required EC members, since add_child_join_rel_equivalences has no idea that such members might exist already. For now we'll just live with that. The lack of field complaints of crashes suggests that this is a mighty little-used situation.
  Back-patch to v12 where this code was introduced.
  Discussion: https://postgr.es/m/1683100.1601860653@sss.pgh.pa.us
* Fix two latent(?) bugs in equivclass.c. (Tom Lane, 2020-10-05)
  get_eclass_for_sort_expr() computes expr_relids and nullable_relids early on, even though they won't be needed unless we make a new EquivalenceClass, which we often don't. Aside from the probably-minor inefficiency, there's a memory management problem: these bitmapsets will be built in the caller's context, leading to dangling pointers if that is shorter-lived than root->planner_cxt. This would be a live bug if get_eclass_for_sort_expr() could be called with create_it = true during GEQO join planning. So far as I can find, the core code never does that, but it's hard to be sure that no extensions do, especially since the comments make it clear that that's supposed to be a supported case. Fix by not computing these values until we've switched into planner_cxt to build the new EquivalenceClass.
  generate_join_implied_equalities() uses inner_rel->relids to look up relevant eclasses, but it ought to be using nominal_inner_relids. This is presently harmless because a child RelOptInfo will always have exactly the same eclass_indexes as its topmost parent; but that might not be true forever, and anyway it makes the code confusing.
  The first of these is old (introduced by me in f3b3b8d5b), so back-patch to all supported branches. The second only dates to v13, but we might as well back-patch it to keep the code looking similar across branches.
  Discussion: https://postgr.es/m/1508010.1601832581@sss.pgh.pa.us
* Improve stability of identity.sql regression test. (Tom Lane, 2020-10-04)
  I noticed while trying to run the regression tests under a low geqo_threshold that one query on information_schema.columns had unstable (as in, variable from one run to the next) output order. This is pretty unsurprising given the complexity of the underlying plan. Interestingly, of this test's three nigh-identical queries on information_schema.columns, the other two already had ORDER BY clauses guaranteeing stable output. Let's make this one look the same.
  Back-patch to v10 where this test was added. We've not heard field reports of the test failing, but this experience shows that it can happen when testing under even slightly unusual conditions.
* Put back explicit setting of replication values within TAP tests. (Tom Lane, 2020-10-01)
  Commit 151c0c5f7 neglected the possibility that a TEMP_CONFIG file would explicitly set max_wal_senders=0; as indeed buildfarm member thorntail does, so that it can test wal_level=minimal in other test suites. Hence, rather than assuming that max_wal_senders=10 will prevail if we say nothing, set it explicitly. Set max_replication_slots=10 explicitly too, just to be safe.
  Back-patch to v10, like the previous patch.
  Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Fix incorrect assertion on number of array dimensions. (Heikki Linnakangas, 2020-10-01)
  This has been wrong ever since the support for multi-dimensional arrays as PL/python function arguments and return values was introduced in commit 94aceed317.
  Backpatch-through: 10
  Discussion: https://www.postgresql.org/message-id/61647b8e-961c-0362-d5d3-c8a18f4a7ec6%40iki.fi
* Reword partitioning error message (Alvaro Herrera, 2020-09-30)
  The error message about columns in the primary key not including all of the partition key was unclear; reword it.
  Backpatch all the way to pg11, where it appeared.
  Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com>
  Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
* Fix handling of BC years in to_date/to_timestamp. (Tom Lane, 2020-09-30)
  Previously, a conversion such as to_date('-44-02-01','YYYY-MM-DD') would result in '0045-02-01 BC', as the code attempted to interpret the negative year as BC, but failed to apply the correction needed for our internal handling of BC years. Fix the off-by-one problem.
  Also, arrange for the combination of a negative year and an explicit "BC" marker to cancel out and produce AD. This is how the negative-century case works, so it seems sane to do likewise.
  Continue to read "year 0000" as 1 BC. Oracle would throw an error, but we've accepted that case for a long time so I'm hesitant to change it in a back-patch.
  Per bug #16419 from Saeed Hubaishan. Back-patch to all supported branches.
  Dar Alathar-Yemen and Tom Lane
  Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
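  The off-by-one comes from the internal year numbering, in which there is no gap between 1 BC and AD 1, so 1 BC is stored as year 0 and 44 BC as year -43. A small Python sketch of the rules stated in this commit message follows; `internal_year` and `display_year` are hypothetical helpers, not the formatting.c code.

```python
def internal_year(parsed_year: int, bc_marker: bool = False) -> int:
    """Map a parsed year field to the internal numbering (0 means 1 BC)."""
    bc = bc_marker
    if parsed_year < 0:
        # A negative year means BC; combined with an explicit BC marker
        # the two cancel out and the result is AD.
        bc = not bc
        parsed_year = -parsed_year
    if parsed_year == 0:
        return 0  # "year 0000" continues to be read as 1 BC
    # The correction for BC years: N BC is stored as -(N - 1).
    return -(parsed_year - 1) if bc else parsed_year


def display_year(internal: int) -> str:
    """Render an internal year the way a date would be printed."""
    if internal <= 0:
        return f"{1 - internal:04d} BC"
    return f"{internal:04d}"


print(display_year(internal_year(-44)))  # corrected handling of '-44'
print(display_year(-44))                 # old behavior: -44 used directly
```

  Using -44 directly as the internal year prints "0045 BC", which matches the wrong result quoted in the commit message; applying the correction gives "0044 BC".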
* Remove obsolete replication settings within TAP tests. (Tom Lane, 2020-09-29)
  PostgresNode.pm set "max_wal_senders = 5" for replication testing, but this seems to be slightly too low for our current test suite. Slower buildfarm members frequently report "number of requested standby connections exceeds max_wal_senders" failures, due to old walsenders not exiting instantaneously. Usually, the test does not fail overall because of automatic walreceiver restart, but sometimes the failure becomes visible; and in any case such retries slow down the test.
  That value came in with commit 89ac7004d, but was soon obsoleted by f6d6d2920, which raised the built-in default from zero to 10; so that PostgresNode.pm is actually setting it to less than the conservative built-in default. That seems pretty pointless, so let's remove the special setting and let the default prevail, in hopes of making the TAP tests more robust. Likewise, the setting "max_replication_slots = 5" is obsolete and can be removed. While here, reverse-engineer a comment about why we're choosing less-than-default values for some other settings. (Note: before v12, max_wal_senders counted against max_connections so that the latter setting also needs some fiddling with.)
  Back-patch to v10 where the subscription tests were added. It's likely that the older branches aren't pushing the boundaries of max_wal_senders, but I'm disinclined to spend time trying to figure out exactly when it started to be a problem.
  Discussion: https://postgr.es/m/723911.1601417626@sss.pgh.pa.us
* Fix memory leak in plpgsql's CALL processing.Tom Lane2020-09-29
| | | | | | | | | | | | | | | | | | | | | When executing a CALL or DO in a non-atomic context (i.e., not inside a function or query), plpgsql creates a new plan each time through, as a rather hacky solution to some resource management issues. But it failed to free this plan until exit of the current procedure or DO block, resulting in serious memory bloat in procedures that called other procedures many times. Fix by remembering to free the plan, and by being more honest about restoring the previous state (otherwise, recursive procedure calls have a problem). There was also a smaller leak associated with recalculation of the "target" list of output variables. Fix that by using the statement-lifespan context to hold non-permanent values. Back-patch to v11 where procedures were introduced. Pavel Stehule and Tom Lane Discussion: https://postgr.es/m/CAFj8pRDiiU1dqym+_P4_GuTWm76knJu7z9opWayBJTC0nQGUUA@mail.gmail.com
* Support for ISO 8601 in the jsonpath .datetime() methodAlexander Korotkov2020-09-29
| | | | | | | | | | | | | | | | The SQL standard doesn't require the jsonpath .datetime() method to support the ISO 8601 format. But our to_json[b]() functions convert timestamps to text in the ISO 8601 format for the sake of compatibility with JavaScript. So, we add support for ISO 8601 to the jsonpath .datetime() method for the sake of compatibility with to_json[b](). The standard mode of datetime parsing currently supports just template patterns and separators in the format string. In order to implement ISO 8601, we have to add support for double quotes in the format string to the standard parsing mode. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov, revised by me Backpatch-through: 13
* Remove excess space from jsonpath .datetime() default format stringAlexander Korotkov2020-09-29
| | | | | | | | | | | bffe1bd684 introduced the jsonpath .datetime() method, but the default formats for time and timestamp contain an excess space between time and timezone. This commit removes the excess space, making the behavior of the .datetime() method standard-compliant. Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru Author: Nikita Glukhov Backpatch-through: 13
* Archive timeline history files in standby if archive_mode is set to "always".Fujii Masao2020-09-29
| | | | | | | | | | | | | | | | | | | | Previously the standby server didn't archive timeline history files streamed from the primary even when archive_mode was set to "always", although it did archive the streamed WAL files. This could cause PITR to fail because a required timeline history file was missing from the archive. The cause of this issue was that walreceiver didn't mark those files as ready for archiving. This commit makes walreceiver mark the streamed timeline history files as ready for archiving if archive_mode=always. The archiver process then archives the marked timeline history files. Back-patch to all supported versions. Reported-by: Grigory Smolkin Author: Grigory Smolkin, Fujii Masao Reviewed-by: David Zhang, Anastasia Lubennikova Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru
* Fix progress reporting of REINDEX CONCURRENTLYMichael Paquier2020-09-29
| | | | | | | | | | | | | | | | This addresses a couple of issues with progress reporting for REINDEX CONCURRENTLY: - Report the correct parent relation with the index actually being rebuilt or validated. Previously, the command status remained set to the last index created for the progress of the index build and validation, which would be incorrect when working on a table that has more than one index. - Use the correct phase when waiting before the drop of the old indexes. Previously, this was reported with the same status as when waiting before the old indexes are marked as dead. Author: Matthias van de Meent, Michael Paquier Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com Backpatch-through: 12
* Add for_each_from, to simplify loops starting from non-first list cells.Tom Lane2020-09-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as for_each_cell(lc, lnext(list_head(list))). Commit 1cff1b95a changed these to for_each_cell(lc, list, list_second_cell(list)). This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as for_each_from(lc, list, 1). This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone. Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell(). Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a. Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three. In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead. Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
* Assign collations in partition bound expressions.Tom Lane2020-09-28
| | | | | | | | | | Failure to do this can result in errors during evaluation of the bound expression, as illustrated by the new regression test. Back-patch to v12 where the ability for partition bounds to be expressions was added. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Revise RelationBuildRowSecurity() to avoid memory leaks.Tom Lane2020-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This function leaked some memory while loading qual clauses for an RLS policy. While ordinarily negligible, that could build up in some repeated-reload cases, as reported by Konstantin Knizhnik. We can improve matters by borrowing the coding long used in RelationBuildRuleLock: build stringToNode's result directly in the target context, and remember to explicitly pfree the input string. This patch by no means completely guarantees zero leaks within this function, since we have no real guarantee that the catalog-reading subroutines it calls don't leak anything. However, practical tests suggest that this is enough to resolve the issue. In any case, any remaining leaks are similar to those risked by RelationBuildRuleLock and other relcache-loading subroutines. If we need to fix them, we should adopt a more global approach such as that used by the RECOVER_RELATION_BUILD_MEMORY hack. While here, let's remove the need for an expensive PG_TRY block by using MemoryContextSetParent to reparent an initially-short-lived context for the RLS data. Back-patch to all supported branches. Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru
* Fix handling of -d "connection string" in pg_dump/pg_restore.Tom Lane2020-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parallel pg_dump failed if its -d parameter was a connection string containing any essential information other than host, port, or username. The same was true for pg_restore with --create. The reason is that these scenarios failed to preserve the connection string from the command line; the code felt free to replace that with just the database name when reconnecting from a pg_dump parallel worker or after creating the target database. By chance, parallel pg_restore did not suffer this defect, as long as you didn't say --create. In practice it seems that the error would be obvious only if the connstring included essential, non-default SSL or GSS parameters. This may explain why it took us so long to notice. (It also makes it very difficult to craft a regression test case illustrating the problem, since the test would fail in builds without those options.) Fix by refactoring so that ConnectDatabase always receives all the relevant options directly from the command line, rather than reconstructed values. Inject a different database name, when necessary, by relying on libpq's rules for handling multiple "dbname" parameters. While here, let's get rid of the essentially duplicate _connectDB function, as well as some obsolete nearby cruft. Per bug #16604 from Zsolt Ero. Back-patch to all supported branches. Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
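The "multiple dbname parameters" trick mentioned above can be illustrated with a toy model. It reflects the rule as documented for libpq's keyword/value connection arrays, as far as I understand it: the first dbname entry may be a full connection string and is expanded in place, and for any repeated keyword the last non-empty value wins. The function name is hypothetical and the parsing is deliberately naive (space-separated key=value pairs only, no quoting and no URIs), so this is a sketch of the idea rather than libpq's implementation.

```python
def effective_params(keywords, values):
    """Toy model of 'expand the first dbname, last non-empty value wins'."""
    result = {}
    dbname_seen = False
    for key, value in zip(keywords, values):
        expand = False
        if key == "dbname" and value is not None and not dbname_seen:
            # Only the first dbname entry is a candidate for expansion.
            dbname_seen = True
            expand = "=" in value
        if not value:
            continue  # NULL or empty values never override earlier ones
        if expand:
            # Naive connection-string parsing: "k1=v1 k2=v2 ...".
            for item in value.split():
                name, _, setting = item.partition("=")
                result[name] = setting
        else:
            result[key] = value
    return result


# The original -d connection string goes first; the database to switch to
# is appended as a second, plain dbname entry.
params = effective_params(
    ["dbname", "dbname"],
    ["host=db1 sslmode=require dbname=postgres", "newdb"],
)
print(params)
```

In this model the host and sslmode settings from the original connection string survive, while the later plain dbname entry replaces the database name, which is the effect the commit message describes.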
* Fix missing fsync of SLRU directories.Thomas Munro2020-09-24
| | | | | | | | | | | | | Harmonize behavior by moving responsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit 1b02be21), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
* Avoid possible dangling-pointer access in tsearch_readline_callback.Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se