aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Fix unportable use of getnameinfo() in pg_hba_file_rules view.Tom Lane2020-11-02
| | | | | | | | | | | | | | | | | | | | | | | fill_hba_line() thought it could get away with passing sizeof(struct sockaddr_storage) rather than the actual addrlen previously returned by getaddrinfo(). While that appears to work on many platforms, it does not work on FreeBSD 11: you get back a failure, which leads to the view showing NULL for the address and netmask columns in all rows. The POSIX spec for getnameinfo() is pretty clearly on FreeBSD's side here: you should pass the actual address length. So it seems plausible that there are other platforms where this coding also fails, and we just hadn't noticed. Also, IMO the fact that getnameinfo() failure leads to a NULL output is pretty bogus in itself. Our pg_getnameinfo_all() wrapper is careful to emit "???" on failure, and we should use that in such cases. NULL should only be emitted in rows that don't have IP addresses. Per bug #16695 from Peter Vandivier. Back-patch to v10 where this code was added. Discussion: https://postgr.es/m/16695-a665558e2f630be7@postgresql.org
* Extend PageIsVerified() to handle more custom optionsMichael Paquier2020-11-02
| | | | | | | | | | | | | | | | | | | | | | This is useful for checks of relation pages without having to load the pages into the shared buffers, and two cases can make use of that: page verification in base backups and the online, lock-safe, flavor. Compatibility is kept with past versions using a routine that calls the new extended routine with the set of options compatible with the original version. Contrary to d401c576, a macro cannot be used as there may be external code relying on the presence of the original routine. This is applied down to 11, where this will be used by a follow-up commit addressing a set of issues with page verification in base backups. Extracted from a larger patch by the same author. Author: Anastasia Lubennikova Reviewed-by: Michael Paquier, Julien Rouhaud Discussion: https://postgr.es/m/608f3476-0598-2514-2c03-e05c7d2b0cbd@postgrespro.ru Backpatch-through: 11
* Preserve index data in pg_statistic across REINDEX CONCURRENTLYMichael Paquier2020-11-01
| | | | | | | | | | | | | | | | | | Statistics associated to an index got lost after running REINDEX CONCURRENTLY, while the non-concurrent case preserves these correctly. The concurrent and non-concurrent operations need to be consistent for the end-user, and missing statistics would force to wait for a new analyze to happen, which could take some time depending on the activity of the existing autovacuum workers. This issue is fixed by copying any existing entries in pg_statistic associated to the old index to the new one. Note that this copy is already done with the data of the index in the stats collector. Reported-by: Fabrízio de Royes Mello Author: Michael Paquier, Fabrízio de Royes Mello Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAFcNs+qpFPmiHd1oTXvcPdvAHicJDA9qBUSujgAhUMJyUMb+SA@mail.gmail.com Backpatch-through: 12
* Reproduce debug_query_string==NULL on parallel workers.Noah Misch2020-10-31
| | | | | | | | | | Certain background workers initiate parallel queries while debug_query_string==NULL, at which point they attempted strlen(NULL) and died to SIGSEGV. Older debug_query_string observers allow NULL, so do likewise in these newer ones. Back-patch to v11, where commit 7de4a1bcc56f494acbd0d6e70781df877dc8ecb5 introduced the first of these. Discussion: https://postgr.es/m/20201014022636.GA1962668@rfd.leadboat.com
* Calculate extraUpdatedCols in query rewriter, not parser.Tom Lane2020-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's unsafe to do this at parse time because addition of generated columns to a table would not invalidate stored rules containing UPDATEs on the table ... but there might now be dependent generated columns that were not there when the rule was made. This also fixes an oversight that rewriteTargetView failed to update extraUpdatedCols when transforming an UPDATE on an updatable view. (Since the new calculation is downstream of that, rewriteTargetView doesn't actually need to do anything; but before, there was a demonstrable bug there.) In v13 and HEAD, this leads to easily-visible bugs because (since commit c6679e4fc) we won't recalculate generated columns that aren't listed in extraUpdatedCols. In v12 this bitmap is mostly just used for trigger-firing decisions, so you'd only notice a problem if a trigger cared whether a generated column had been updated. I'd complained about this back in May, but then forgot about it until bug #16671 from Michael Paul Killian revived the issue. Back-patch to v12 where this field was introduced. If existing stored rules contain any extraUpdatedCols values, they'll be ignored because the rewriter will overwrite them, so the bug will be fixed even for existing rules. (But note that if someone were to update to 13.1 or 12.5, store some rules with UPDATEs on tables having generated columns, and then downgrade to a prior minor version, they might observe issues similar to what this patch fixes. That seems unlikely enough to not be worth going to a lot of effort to fix.) Discussion: https://postgr.es/m/10206.1588964727@sss.pgh.pa.us Discussion: https://postgr.es/m/16671-2fa55851859fb166@postgresql.org
* Fix use-after-free bug with event triggers and ALTER TABLE.Tom Lane2020-10-27
| | | | | | | | | | | | | | | | | EventTriggerAlterTableEnd neglected to make sure that it built its output list in the right context. In simple cases this was masked because the function is called in PortalContext which will be sufficiently long-lived anyway; but that doesn't make it not a bug. Commit ced138e8c fixed this in HEAD and v13, but mistakenly chose not to back-patch further. Back-patch the same code change all the way (I didn't bother with the test case though, as it would prove nothing in pre-v13 branches). Per report from Arseny Sher. Original fix by Jehan-Guillaume de Rorthais. Discussion: https://postgr.es/m/877drcyprb.fsf@ars-thinkpad Discussion: https://postgr.es/m/20200902193715.6e0269d4@firost
* Accept relations of any kind in LOCK TABLEAlvaro Herrera2020-10-27
| | | | | | | | | | | | | | | The restriction that only tables and views can be locked by LOCK TABLE is quite arbitrary, since the underlying mechanism can lock any relation type. Drop the restriction so that programs such as pg_dump can lock all relations they're interested in, preventing schema changes that could cause a dump to fail after expending much effort. Backpatch to 9.5. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Wells Oliver <wells.oliver@gmail.com> Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
* Fix corner case for a BEFORE ROW UPDATE trigger returning OLD.Tom Lane2020-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | If the old row has any "missing" attributes that are supposed to be retrieved from an associated tuple descriptor, the wrong things happened because the trigger result is shoved directly into an executor slot that lacks the missing-attribute data. Notably, CHECK-constraint verification would incorrectly see those columns as NULL, and so would RETURNING-list evaluation. Band-aid around this by forcibly expanding the tuple before passing it to the trigger function. (IMO it was a fundamental misdesign to put the missing-attribute data into tuple constraints, which so much of the system considers to be optional. But we're probably stuck with that now, and will have to continue to apply band-aids as we find other places with similar issues.) Back-patch to v12. v11 would also have the issue, except that commit 920311ab1 already applied a similar band-aid. That forced expansion in more cases than seem really necessary, though, so this isn't a directly equivalent fix. Amit Langote, with some cosmetic changes by me Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org
* Fix incorrect parameter name in a function header commentDavid Rowley2020-10-25
| | | | | | Author: Zhijie Hou Discussion: https://postgr.es/m/14cd74ea00204cc8a7ea5d738ac82cd1@G08CNEXMBPEKD05.g08.fujitsu.local Backpatch-through: 12, where the mistake was introduced
* Fix ALTER TABLE .. ENABLE/DISABLE TRIGGER recursionAlvaro Herrera2020-10-20
| | | | | | | | | | | | | | | | More precisely, correctly handle the ONLY flag indicating not to recurse. This was implemented in 86f575948c77 by recursing in trigger.c, but that's the wrong place; use ATSimpleRecursion instead, which behaves properly. However, because legacy inheritance has never recursed in that situation, make sure to do that only for new-style partitioning. I noticed this problem while testing a fix for another bug in the vicinity. This has been wrong all along, so backpatch to 11. Discussion: https://postgr.es/m/20201016235925.GA29829@alvherre.pgsql
* Avoid invalid alloc size error in shm_mqPeter Eisentraut2020-10-20
| | | | | | | | | | | | In shm_mq_receive(), a huge payload could trigger an unjustified "invalid memory alloc request size" error due to the way the buffer size is increased. Add error checks (documenting the upper limit) and avoid the error by limiting the allocation size to MaxAllocSize. Author: Markus Wanner <markus.wanner@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/3bb363e7-ac04-0ac4-9fe8-db1148755bfa%402ndquadrant.com
* Relax some asserts in merge join costing codeDavid Rowley2020-10-20
| | | | | | | | | | | | | | | | | | | | | | | | In the planner, it was possible, given an extreme enough case containing a large number of joins for the number of estimated rows to become infinite. This could cause problems in initial_cost_mergejoin() where we perform some calculations based on those row estimates. A problem case, presented by Onder Kalaci showed an Assert failure from an Assert checking outerstartsel <= outerendsel. In his test case this was effectively NaN <= Inf, which is false. The NaN outerstartsel came from multiplying the infinite outer_path_rows by 0.0. In master, this problem was fixed by a90c950fc, however, that fix was too invasive for the backbranches. Here we just relax the Asserts to allow them to pass. The worst that appears to happen from this is that we show NaN cost values and infinite row estimates in EXPLAIN. add_path() would have had a hard time doing anything useful with such costs, but that does not really matter as if the row estimates were even close to accurate, such plan would not complete this side of the heat death of the universe. Reported-by: Onder Kalaci Backpatch: 9.5 to 13 Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com
* llvmjit: Work around bug in LLVM 3.9 causing crashes after 72559438f92.Andres Freund2020-10-15
| | | | | | | | | | | | | | Unfortunately in LLVM 3.9 LLVMGetAttributeCountAtIndex(func, index) crashes when called with an index that has 0 attributes. Since there's no way to work around this in the C API, add a small C++ wrapper doing so. The only reason this didn't fail before 72559438f92 is that there always are function attributes... Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20201016001254.w2nfj7gd74jmb5in@alap3.anarazel.de Backpatch: 11-, like 72559438f92
* llvmjit: Also copy parameter / return value attributes from template functions.Andres Freund2020-10-15
| | | | | | | | | | | | | | | | | | | | | Previously we only copied the function attributes. That caused problems at least on s390x: Because we didn't copy the 'zeroext' attribute for ExecAggTransReparent()'s *IsNull parameters, expressions invoking it didn't ensure that the upper bytes of the registers were zeroed. In the - relatively rare - cases where not, ExecAggTransReparent() wrongly ended up in the newValueIsNull branch due to the register not being zero. Subsequently causing a crash. It's quite possible that this would cause problems on other platforms, and in other places than just ExecAggTransReparent() on s390x. Thanks to Christoph (and the Debian project) for providing me with access to a s390x machine, allowing me to debug this. Reported-By: Christoph Berg Author: Andres Freund Discussion: https://postgr.es/m/20201015083246.kie5726xerdt3ael@alap3.anarazel.de Backpatch: 11-, where JIT was added
* In the postmaster, rely on the signal infrastructure to block signals.Tom Lane2020-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX sigaction(2) can be told to block a set of signals while a signal handler executes. Make use of that instead of manually blocking and unblocking signals in the postmaster's signal handlers. This should save a few cycles, but more importantly it prevents recursive invocation of signal handlers when many signals arrive in close succession. (Assuming that the platform's signal infrastructure is designed to avoid consuming stack space in that case, but this is demonstrably true at least on Linux.) The existing code has been seen to recurse to the point of stack overflow, either in the postmaster or in a forked-off child. Back-patch of commit 9abb2bfc0. At the time, we'd only seen excess postmaster stack consumption in the buildfarm; but we now have a user report of it, and that commit has aged enough to have a fair amount of confidence that it doesn't break anything. This still doesn't change anything about the way that it works on Windows. Perhaps someone else would like to fix that? Per bug #16673 from David Geier. Back-patch to 9.6. Although the problem exists in principle before that, we've only seen it actually materialize in connection with heavy use of parallel workers, so it doesn't seem necessary to do anything in 9.5; and the relevant code is different there, too. Discussion: https://postgr.es/m/16673-d278c604f8e34ec0@postgresql.org Discussion: https://postgr.es/m/14878.1570820201@sss.pgh.pa.us
* Fix GiST buffering build to work when there are included columns.Tom Lane2020-10-12
| | | | | | | | | | | | | | gistRelocateBuildBuffersOnSplit did not get the memo about which attribute count to use. This could lead to a crash if there were included columns and buffering build was chosen. (Because there are random page-split decisions elsewhere in GiST index build, the crashes are not entirely deterministic.) Back-patch to v12 where GiST gained support for included columns. Pavel Borisov Discussion: https://postgr.es/m/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay+r6HQ@mail.gmail.com
* Fix memory leak when guc.c decides a setting can't be applied now.Tom Lane2020-10-12
| | | | | | | | | | | | | | | | | | | | The prohibitValueChange code paths in set_config_option(), which are executed whenever we re-read a PGC_POSTMASTER variable from postgresql.conf, neglected to free anything before exiting. Thus we'd leak the proposed new value of a PGC_STRING variable, as noted by BoChen in bug #16666. For all variable types, if the check hook creates an "extra" chunk, we'd also leak that. These are malloc not palloc chunks, so there is no mechanism for recovering the leaks before process exit. Fortunately, the values are typically not very large, meaning you'd have to go through an awful lot of SIGHUP configuration-reload cycles to make the leakage amount to anything. Still, for a long-lived postmaster process it could potentially be a problem. Oversight in commit 2594cf0e8. Back-patch to all supported branches. Discussion: https://postgr.es/m/16666-2c41a4eec61b03e1@postgresql.org
* Fix optimization hazard in gram.y's makeOrderedSetArgs(), redux.Tom Lane2020-10-07
| | | | | | | | | | | | | | | | It appears that commit cf63c641c, which intended to prevent misoptimization of the result-building step in makeOrderedSetArgs, didn't go far enough: buildfarm member hornet's version of xlc is now optimizing back to the old, broken behavior in which list_length(directargs) is fetched only after list_concat() has changed that value. I'm not entirely convinced whether that's an undeniable compiler bug or whether it can be justified by a sufficiently aggressive interpretation of C sequence points. So let's just change the code to make it harder to misinterpret. Back-patch to all supported versions, just in case. Discussion: https://postgr.es/m/1830491.1601944935@sss.pgh.pa.us
* Build EC members for child join rels in the right memory context.Tom Lane2020-10-06
| | | | | | | | | | | | | | | | | This patch prevents crashes or wrong plans when partition-wise joins are considered during GEQO planning, as a consequence of the EquivalenceClass data structures becoming corrupt after a GEQO context reset. A remaining problem is that successive GEQO cycles will make multiple copies of the required EC members, since add_child_join_rel_equivalences has no idea that such members might exist already. For now we'll just live with that. The lack of field complaints of crashes suggests that this is a mighty little-used situation. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/1683100.1601860653@sss.pgh.pa.us
* Fix two latent(?) bugs in equivclass.c.Tom Lane2020-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | get_eclass_for_sort_expr() computes expr_relids and nullable_relids early on, even though they won't be needed unless we make a new EquivalenceClass, which we often don't. Aside from the probably-minor inefficiency, there's a memory management problem: these bitmapsets will be built in the caller's context, leading to dangling pointers if that is shorter-lived than root->planner_cxt. This would be a live bug if get_eclass_for_sort_expr() could be called with create_it = true during GEQO join planning. So far as I can find, the core code never does that, but it's hard to be sure that no extensions do, especially since the comments make it clear that that's supposed to be a supported case. Fix by not computing these values until we've switched into planner_cxt to build the new EquivalenceClass. generate_join_implied_equalities() uses inner_rel->relids to look up relevant eclasses, but it ought to be using nominal_inner_relids. This is presently harmless because a child RelOptInfo will always have exactly the same eclass_indexes as its topmost parent; but that might not be true forever, and anyway it makes the code confusing. The first of these is old (introduced by me in f3b3b8d5b), so back-patch to all supported branches. The second only dates to v13, but we might as well back-patch it to keep the code looking similar across branches. Discussion: https://postgr.es/m/1508010.1601832581@sss.pgh.pa.us
* Reword partitioning error messageAlvaro Herrera2020-09-30
| | | | | | | | | | The error message about columns in the primary key not including all of the partition key was unclear; reword it. Backpatch all the way to pg11, where it appeared. Reported-by: Nagaraj Raj <nagaraj.sf@yahoo.com> Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
* Fix handling of BC years in to_date/to_timestamp.Tom Lane2020-09-30
| | | | | | | | | | | | | | | | | | | | | | | Previously, a conversion such as to_date('-44-02-01','YYYY-MM-DD') would result in '0045-02-01 BC', as the code attempted to interpret the negative year as BC, but failed to apply the correction needed for our internal handling of BC years. Fix the off-by-one problem. Also, arrange for the combination of a negative year and an explicit "BC" marker to cancel out and produce AD. This is how the negative-century case works, so it seems sane to do likewise. Continue to read "year 0000" as 1 BC. Oracle would throw an error, but we've accepted that case for a long time so I'm hesitant to change it in a back-patch. Per bug #16419 from Saeed Hubaishan. Back-patch to all supported branches. Dar Alathar-Yemen and Tom Lane Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
* Archive timeline history files in standby if archive_mode is set to "always".Fujii Masao2020-09-29
| | | | | | | | | | | | | | | | | | | | Previously the standby server didn't archive timeline history files streamed from the primary even when archive_mode is set to "always", while it archives the streamed WAL files. This could cause the PITR to fail because there was no required timeline history file in the archive. The cause of this issue was that walreceiver didn't mark those files as ready for archiving. This commit makes walreceiver mark those streamed timeline history files as ready for archiving if archive_mode=always. Then the archiver process archives the marked timeline history files. Back-patch to all supported versions. Reported-by: Grigory Smolkin Author: Grigory Smolkin, Fujii Masao Reviewed-by: David Zhang, Anastasia Lubennikova Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru
* Fix progress reporting of REINDEX CONCURRENTLYMichael Paquier2020-09-29
| | | | | | | | | | | | | | | | This addresses a couple of issues with the so-said subject: - Report the correct parent relation with the index actually being rebuilt or validated. Previously, the command status remained set to the last index created for the progress of the index build and validation, which would be incorrect when working on a table that has more than one index. - Use the correct phase when waiting before the drop of the old indexes. Previously, this was reported with the same status as when waiting before the old indexes are marked as dead. Author: Matthias van de Meent, Michael Paquier Discussion: https://postgr.es/m/CAEze2WhqFgcwe1_tv=sFYhLWV2AdpfukumotJ6JNcAOQs3jufg@mail.gmail.com Backpatch-through: 12
* Assign collations in partition bound expressions.Tom Lane2020-09-28
| | | | | | | | | | Failure to do this can result in errors during evaluation of the bound expression, as illustrated by the new regression test. Back-patch to v12 where the ability for partition bounds to be expressions was added. Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
* Revise RelationBuildRowSecurity() to avoid memory leaks.Tom Lane2020-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This function leaked some memory while loading qual clauses for an RLS policy. While ordinarily negligible, that could build up in some repeated-reload cases, as reported by Konstantin Knizhnik. We can improve matters by borrowing the coding long used in RelationBuildRuleLock: build stringToNode's result directly in the target context, and remember to explicitly pfree the input string. This patch by no means completely guarantees zero leaks within this function, since we have no real guarantee that the catalog- reading subroutines it calls don't leak anything. However, practical tests suggest that this is enough to resolve the issue. In any case, any remaining leaks are similar to those risked by RelationBuildRuleLock and other relcache-loading subroutines. If we need to fix them, we should adopt a more global approach such as that used by the RECOVER_RELATION_BUILD_MEMORY hack. While here, let's remove the need for an expensive PG_TRY block by using MemoryContextSetParent to reparent an initially-short-lived context for the RLS data. Back-patch to all supported branches. Discussion: https://postgr.es/m/21356c12-8917-8249-b35f-1c447231922b@postgrespro.ru
* Fix missing fsync of SLRU directories.Thomas Munro2020-09-24
| | | | | | | | | | | | | Harmonize behavior by moving reponsibility for fsyncing directories down into slru.c. In 10 and later, only the multixact directories were missed (see commit 1b02be21), and in older branches all SLRUs were missed. Back-patch to all supported releases. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
* Avoid possible dangling-pointer access in tsearch_readline_callback.Tom Lane2020-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | tsearch_readline() saves the string pointer it returns to the caller for possible use in the associated error context callback. However, the caller will usually pfree that string sometime before it next calls tsearch_readline(), so that there is a window where an ereport will try to print an already-freed string. The built-in users of tsearch_readline() happen to all do that pfree at the bottoms of their loops, so that the window is effectively empty for them. However, this is not documented as a requirement, and contrib/dict_xsyn doesn't do it like that, so it seems likely that third-party dictionaries might have live bugs here. The practical consequences of this seem pretty limited in any case, since production builds wouldn't clobber the freed string immediately, besides which you'd not expect syntax errors in dictionary files being used in production. Still, it's clearly a bug waiting to bite somebody. Fix by pstrdup'ing the string to be saved for the error callback, and then pfree'ing it next time through. It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
* Fix whitespacePeter Eisentraut2020-09-20
|
* Update parallel BTree scan state when the scan keys can't be satisfied.Amit Kapila2020-09-17
| | | | | | | | | | | | | | For parallel btree scan to work for array of scan keys, it should reach BTPARALLEL_DONE state once for every distinct combination of array keys. This is required to ensure that the parallel workers don't try to seize blocks at the same time for different scan keys. We missed to update this state when we discovered that the scan keys can't be satisfied. Author: James Hunter Reviewed-by: Amit Kapila Tested-by: Justin Pryzby Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
* Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a partitioned table's column is already marked NOT NULL, there is no need to examine its partitions, because we can rely on previous DDL to have enforced that the child columns are NOT NULL as well. (Unfortunately, the same cannot be said for traditional inheritance, so for now we have to restrict the optimization to partitioned tables.) Hence, we may skip recursing to child tables in this situation. The reason this case is worth worrying about is that when pg_dump dumps a partitioned table having a primary key, it will include the requisite NOT NULL markings in the CREATE TABLE commands, and then add the primary key as a separate step. The primary key addition generates a SET NOT NULL as a subcommand, just to be sure. So the situation where a SET NOT NULL is redundant does arise in the real world. Skipping the recursion does more than just save a few cycles: it means that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY KEY" will take locks only on the partition parent table, not on the partitions. It turns out that parallel pg_restore is effectively assuming that that's true, and has little choice but to do so because the dependencies listed for such a TOC entry don't include the partitions. pg_restore could thus issue this ALTER while data restores on the partitions are still in progress. Taking unnecessary locks on the partitions not only hurts concurrency, but can lead to actual deadlock failures, as reported by Domagoj Smoljanovic. (A contributing factor in the deadlock is that TRUNCATE on a child partition wants a non-exclusive lock on the parent. This seems likewise unnecessary, but the fix for it is more invasive so we won't consider back-patching it. Fortunately, getting rid of one of these two poor behaviors is enough to remove the deadlock.) Although support for partitioned primary keys came in with v11, this patch is dependent on the SET NOT NULL refactoring done by commit f4a3fdfbd, so we can only patch back to v12. Patch by me; thanks to Alvaro Herrera and Amit Langote for review. Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
* Fix bogus cache-invalidation logic in logical replication worker.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | The code recorded cache invalidation events by zeroing the "localreloid" field of affected cache entries. However, it's possible for an inval event to occur even while we have the entry open and locked. So an ill-timed inval could result in "cache lookup failed for relation 0" errors, if the worker's code tried to use the cleared field. We can fix that by creating a separate bool field to record whether the entry needs to be revalidated. (In the back branches, cram the bool into what had been padding space, to avoid an ABI break in the somewhat unlikely event that any extension is looking at this struct.) Also, rearrange the logic in logicalrep_rel_open so that it does the right thing in cases where table_open would fail. We should retry the lookup by name in that case, but we didn't. The real-world impact of this is probably small. In the first place, the error conditions are very low probability, and in the second place, the worker would just exit and get restarted. We only noticed because in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly, preventing the worker from making progress. Nonetheless, it's clearly a bug, and it impedes a useful type of testing; so back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
* Use the properly transformed RangeVar for expandTableLikeClause().Tom Lane2020-09-13
| | | | | | | | | | | | | | | transformCreateStmt() adjusts the transformed statement's RangeVar to specify the target schema explicitly, for the express reason of making sure that auxiliary statements derived by parse transformation operate on the right table. But the refactoring I did in commit 502898192 got this wrong and passed the untransformed RangeVar to expandTableLikeClause(). This could lead to assertion failures or weird misbehavior if the wrong table was accessed. Per report from Alexander Lakhin. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/05051f9d-b32b-cb35-6735-0e9f2ab86b5f@gmail.com
* Use _exit(2) for SIGQUIT during ProcessStartupPacket, too.Tom Lane2020-09-10
| | | | | | | | | | | | | | | | | | | | | Bring the signal handling for startup-packet collection into line with the policy established in commits bedadc732 and 8e19a8264, namely don't risk running atexit callbacks when handling SIGQUIT. Ideally, we'd not do so for SIGTERM or timeout interrupts either, but that change seems a bit too risky for the back branches. For now, just improve the comments in this area to describe the risk. Also relocate where BackendInitialize re-disables these interrupts, to minimize the code span where they're active. This doesn't buy a whole lot of safety, but it can't hurt. In passing, rename startup_die() to remove confusion about whether it is for the startup process. Like the previous commits, back-patch to all supported branches. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
* doc: Fix some grammar and inconsistenciesMichael Paquier2020-09-10
| | | | | | | | Some comments are fixed while on it. Author: Justin Pryzby Discussion: https://postgr.es/m/20200818171702.GK17022@telsasoft.com Backpatch-through: 9.6
* Make archiver's SIGQUIT handler exit via _exit().Tom Lane2020-09-09
| | | | | | | | | | | | | | | | | | Commit 8e19a8264 changed the SIGQUIT handlers of almost all server processes not to run atexit callbacks. The archiver process was skipped, perhaps because it's not connected to shared memory; but it's just as true here that running atexit callbacks in a signal handler is unsafe. So let's make it work like the rest. In HEAD and v13, we can use the common SignalHandlerForCrashExit handler. Before that, just tweak pgarch_exit to use _exit(2) explicitly. Like the previous commit, back-patch to all supported branches. Kyotaro Horiguchi, back-patching by me Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
* Check default partitions constraints while descendingAlvaro Herrera2020-09-08
| | | | | | | | | | | | | | | | | | | | | | | Partitioning tuple route code assumes that the partition chosen while descending the partition hierarchy is always the correct one. This is true except when the partition is the default partition and another partition has been added concurrently: the partition constraint changes and we don't recheck it. This can lead to tuples mistakenly being added to the default partition that should have been rejected. Fix by rechecking the default partition constraint while descending the hierarchy. An isolation test based on the reproduction steps described by Hao Wu (with tweaks for extra coverage) is included. Backpatch to 12, where this bug came in with 898e5e3290a7. Reported by: Hao Wu <hawu@vmware.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CA+HiwqFqBmcSSap4sFnCBUEL_VfOMmEKaQ3gwUhyfa4c7J_-nA@mail.gmail.com Discussion: https://postgr.es/m/DM5PR0501MB3910E97A9EDFB4C775CF3D75A42F0@DM5PR0501MB3910.namprd05.prod.outlook.com
* Fix misleading error message about inconsistent moving-aggregate types.Tom Lane2020-09-06
| | | | | | | | | | | | | We reported the wrong types when complaining that an aggregate's moving-aggregate implementation is inconsistent with its regular implementation. This was wrong since the feature was introduced, so back-patch to all supported branches. Jeff Janes Discussion: https://postgr.es/m/CAMkU=1x808LH=LPhZp9mNSP0Xd1xDqEd+XeGcvEe48dfE6xV=A@mail.gmail.com
* Fix over-eager ping'ing in logical replication receiver.Tom Lane2020-09-04
| | | | | | | | | | | Commit 3f60f690f only partially fixed the broken-status-tracking issue in LogicalRepApplyLoop: we need ping_sent to have the same lifetime as last_recv_timestamp. The effects are much less serious than what that commit fixed, though. AFAICS this would just lead to extra ping requests being sent, once per second until the sender responds. Still, it's a bug, so backpatch to v10 as before. Discussion: https://postgr.es/m/959627.1599248476@sss.pgh.pa.us
* Avoid lockup of a parallel worker when reporting a long error message.Tom Lane2020-09-03
| | | | | | | | | | | | | | | | | | | | | Because sigsetjmp() will restore the initial state with signals blocked, the code path in bgworker.c for reporting an error and exiting would execute that way. Usually this is fairly harmless; but if a parallel worker had an error message exceeding the shared-memory communication buffer size (16K) it would lock up, because it would wait for a resume-sending signal from its parallel leader which it would never detect. To fix, just unblock signals at the appropriate point. This can be shown to fail back to 9.6. The lack of parallel query infrastructure makes it difficult to provide a simple test case for 9.5; but I'm pretty sure the issue exists in some form there as well, so apply the code change there too. Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
* Raise error on concurrent drop of partitioned indexAlvaro Herrera2020-09-01
| | | | | | | | | | | | | | | | | | We were already raising an error for DROP INDEX CONCURRENTLY on a partitioned table, albeit a different and confusing one: ERROR: DROP INDEX CONCURRENTLY must be first action in transaction Change that to throw a more comprehensible error: ERROR: cannot drop partitioned index \"%s\" concurrently Michael Paquier authored the test case for indexes on temporary partitioned tables. Backpatch to 11, where indexes on partitioned tables were added. Reported-by: Jan Mussler <jan.mussler@zalando.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
* Fix code for re-finding scan position in a multicolumn GIN index.Tom Lane2020-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | collectMatchBitmap() needs to re-find the index tuple it was previously looking at, after transiently dropping lock on the index page it's on. The tuple should still exist and be at its prior position or somewhere to the right of that, since ginvacuum never removes tuples but concurrent insertions could add one. However, there was a thinko in that logic, to the effect of expecting any inserted tuples to have the same index "attnum" as what we'd been scanning. Since there's no physical separation of tuples with different attnums, it's not terribly hard to devise scenarios where this fails, leading to transient "lost saved point in index" errors. (While I've duplicated this with manual testing, it seems impossible to make a reproducible test case with our available testing technology.) Fix by just continuing the scan when the attnum doesn't match. While here, improve the error message used if we do fail, so that it matches the wording used in btree for a similar case. collectMatchBitmap()'s posting-tree code path was previously not exercised at all by our regression tests. While I can't make a regression test that exhibits the bug, I can at least improve the code coverage here, so do that. The test case I made for this is an extension of one added by 4b754d6c1, so it only works in HEAD and v13; didn't seem worth trying hard to back-patch it. Per bug #16595 from Jesse Kinkead. This has been broken since multicolumn capability was added to GIN (commit 27cb66fdf), so back-patch to all supported branches. Discussion: https://postgr.es/m/16595-633118be8eef9ce2@postgresql.org
* Avoid pushing quals down into sub-queries that have grouping sets.Tom Lane2020-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | The trouble with doing this is that an apparently-constant subquery output column isn't really constant if it is a grouping column that appears in only some of the grouping sets. A qual using such a column would be subject to incorrect const-folding after push-down, as seen in bug #16585 from Paul Sivash. To fix, just disable qual pushdown altogether if the sub-query has nonempty groupingSets. While we could imagine far less restrictive solutions, there is not much point in working harder right now, because subquery_planner() won't move HAVING clauses to WHERE within such a subquery. If the qual stays in HAVING it's not going to be a lot more useful than if we'd kept it at the outer level. Having said that, this restriction could be removed if we used a parsetree representation that distinguished such outputs from actual constants, which is something I hope to do in future. Hence, make the patch a minimal addition rather than integrating it more tightly (e.g. by renumbering the existing items in subquery_is_pushdown_safe's comment). Back-patch to 9.5 where grouping sets were introduced. Discussion: https://postgr.es/m/16585-9d8c340d23ade8c1@postgresql.org
* Fix handling of CREATE TABLE LIKE with inheritance.Tom Lane2020-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a CREATE TABLE command uses both LIKE and traditional inheritance, Vars in CHECK constraints and expression indexes that are absorbed from a LIKE parent table tended to get mis-numbered, resulting in wrong answers and/or bizarre error messages (though probably not any actual crashes, thanks to validation occurring in the executor). In v12 and up, the same could happen to Vars in GENERATED expressions, even in cases with no LIKE clause but multiple traditional-inheritance parents. The cause of the problem for LIKE is that parse_utilcmd.c supposed it could renumber such Vars correctly during transformCreateStmt(), which it cannot since we have not yet accounted for columns added via inheritance. Fix that by postponing processing of LIKE INCLUDING CONSTRAINTS, DEFAULTS, GENERATED, INDEXES till after we've performed DefineRelation(). The error with GENERATED and multiple inheritance is a simple oversight in MergeAttributes(); it knows it has to renumber Vars in inherited CHECK constraints, but forgot to apply the same processing to inherited GENERATED expressions (a/k/a defaults). Per bug #16272 from Tom Gottfried. The non-GENERATED variants of the issue are ancient, presumably dating right back to the addition of CREATE TABLE LIKE; hence back-patch to all supported branches. Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org
* Fix a few typos in JIT comments and READMEDavid Rowley2020-08-21
| | | | | | | Reviewed-by: Abhijit Menon-Sen Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAApHDvobgmCs6CohqhKTUf7D8vffoZXQTCBTERo9gbOeZmvLTw%40mail.gmail.com Backpatch-through: 11, where JIT was added
* Suppress unnecessary RelabelType nodes in yet more cases.Tom Lane2020-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a477bfc1d fixed eval_const_expressions() to ensure that it didn't generate unnecessary RelabelType nodes, but I failed to notice that some other places in the planner had the same issue. Really noplace in the planner should be using plain makeRelabelType(), for fear of generating expressions that should be equal() to semantically equivalent trees, but aren't. An example is that because canonicalize_ec_expression() failed to be careful about this, we could end up with an equivalence class containing both a plain Const, and a Const-with-RelabelType representing exactly the same value. So far as I can tell this led to no visible misbehavior, but we did waste a bunch of cycles generating and evaluating "Const = Const-with-RelabelType" to prove such entries are redundant. Hence, move the support function added by a477bfc1d to where it can be more generally useful, and use it in the places where planner code previously used makeRelabelType. Back-patch to v12, like the previous patch. While I have no concrete evidence of any real misbehavior here, it's certainly possible that I overlooked a case where equivalent expressions that aren't equal() could cause a user-visible problem. In any case carrying extra RelabelType nodes through planning to execution isn't very desirable. Discussion: https://postgr.es/m/1311836.1597781384@sss.pgh.pa.us
* Move new LOCKTAG_DATABASE_FROZEN_IDS to end of enum LockTagType.Noah Misch2020-08-15
| | | | | | | | | | | Several PGXN modules reference LockTagType values; renumbering would force a recompile of those modules. Oversight in back-patch of today's commit 566372b3d6435639e4cc4476d79b8505a0297c87. Back-patch to released branches, v12 through 9.5. Reported by Tom Lane. Discussion: https://postgr.es/m/921383.1597523945@sss.pgh.pa.us
* Prevent concurrent SimpleLruTruncate() for any given SLRU.Noah Misch2020-08-15
| | | | | | | | | | | | | | | | | The SimpleLruTruncate() header comment states the new coding rule. To achieve this, add locktype "frozenid" and two LWLocks. This closes a rare opportunity for data loss, which manifested as "apparent wraparound" or "could not access status of transaction" errors. Data loss is more likely in pg_multixact, due to released branches' thin margin between multiStopLimit and multiWrapLimit. If a user's physical replication primary logged ": apparent wraparound" messages, the user should rebuild standbys of that primary regardless of symptoms. At less risk is a cluster having emitted "not accepting commands" errors or "must be vacuumed" warnings at some point. One can test a cluster for this data loss by running VACUUM FREEZE in every database. Back-patch to 9.5 (all supported versions). Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
* Be more careful about the shape of hashable subplan clauses.Tom Lane2020-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan has the form of one or more OpExprs whose LHS is an expression of the outer query's, while the RHS is an expression over Params representing output columns of the subquery. However, the planner only went as far as verifying that the clauses were all binary OpExprs. This works 99.99% of the time, because the clauses have the right shape when emitted by the parser --- but it's possible for function inlining to break that, as reported by PegoraroF10. To fix, teach the planner to check that the LHS and RHS contain the right things, or more accurately don't contain the wrong things. Given that this has been broken for years without anyone noticing, it seems sufficient to just give up hashing when it happens, rather than go to the trouble of commuting the clauses back again (which wouldn't necessarily work anyway). While poking at that, I also noticed that nodeSubplan.c had a baked-in assumption that the number of hash clauses is identical to the number of subquery output columns. Again, that's fine as far as parser output goes, but it's not hard to break it via function inlining. There seems little reason for that assumption though --- AFAICS, the only thing it's buying us is not having to store the number of hash clauses explicitly. Adding code to the planner to reject such cases would take more code than getting nodeSubplan.c to cope, so I fixed it that way. This has been broken for as long as we've had hashable SubPlans, so back-patch to all supported branches. Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
* Fix postmaster's behavior during smart shutdown.Tom Lane2020-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, upon receipt of a SIGTERM ("smart shutdown" command), the postmaster has immediately killed all "optional" background processes, and subsequently refused to launch new ones while it's waiting for foreground client processes to exit. No doubt this seemed like an OK policy at some point; but it's a pretty bad one now, because it makes for a seriously degraded environment for the remaining clients: * Parallel queries are killed, and new ones fail to launch. (And our parallel-query infrastructure utterly fails to deal with the case in a reasonable way --- it just hangs waiting for workers that are not going to arrive. There is more work needed in that area IMO.) * Autovacuum ceases to function. We can tolerate that for awhile, but if bulk-update queries continue to run in the surviving client sessions, there's eventually going to be a mess. In the worst case the system could reach a forced shutdown to prevent XID wraparound. * The bgwriter and walwriter are also stopped immediately, likely resulting in performance degradation. Hence, let's rearrange things so that the only immediate change in behavior is refusing to let in new normal connections. Once the last normal connection is gone, shut everything down as though we'd received a "fast" shutdown. To implement this, remove the PM_WAIT_BACKUP and PM_WAIT_READONLY states, instead staying in PM_RUN or PM_HOT_STANDBY while normal connections remain. A subsidiary state variable tracks whether or not we're letting in new connections in those states. This also allows having just one copy of the logic for killing child processes in smart and fast shutdown modes. I moved that logic into PostmasterStateMachine() by inventing a new state PM_STOP_BACKENDS. Back-patch to 9.6 where parallel query was added. In principle this'd be a good idea in 9.5 as well, but the risk/reward ratio is not as good there, since lack of autovacuum is not a problem during typical uses of smart shutdown. Per report from Bharath Rupireddy. Patch by me, reviewed by Thomas Munro Discussion: https://postgr.es/m/CALj2ACXAZ5vKxT9P7P89D87i3MDO9bfS+_bjMHgnWJs8uwUOOw@mail.gmail.com