aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Update time zone data files to tzdata release 2019a.Tom Lane2019-04-26
| | | | | | | | | | DST law changes in Palestine and Metlakatla. Historical corrections for Israel. Etc/UCT is now a backward-compatibility link to Etc/UTC, instead of being a separate zone that generates the abbreviation "UCT", which nowadays is typically a typo. Postgres will still accept "UCT" as an input zone name, but it won't output it.
* Apply stopgap fix for bug #15672.Tom Lane2019-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix DefineIndex so that it doesn't attempt to pass down a to-be-reused index relfilenode to a child index creation, and fix TryReuseIndex to not think that reuse is sensible for a partitioned index. In v11, this fixes a problem where ALTER TABLE on a partitioned table could assign the same relfilenode to several different child indexes, causing very nasty catalog corruption --- in fact, attempting to DROP the partitioned table then leads not only to a database crash, but to inability to restart because the same crash will recur during WAL replay. Either of these two changes would be enough to prevent the failure, but since neither action could possibly be sane, let's put in both changes for future-proofing. In HEAD, no such bug manifests, but that's just an accidental consequence of having changed the pg_class representation of partitioned indexes to have relfilenode = 0. Both of these changes still seem like smart future-proofing. This is only a stop-gap because the code for ALTER TABLE on a partitioned table with a no-op type change still leaves a great deal to be desired. As the added regression tests show, it gets things wrong for comments on child indexes/constraints, and it is regenerating child indexes it doesn't have to. However, fixing those problems will take more work which may not get back-patched into v11. We need a fix for the corruption problem now. Per bug #15672 from Jianing Yang. Patch by me, regression test cases based on work by Amit Langote, who also did a lot of the investigative work. Discussion: https://postgr.es/m/15672-b9fa7db32698269f@postgresql.org
* Fix partitioned index attachmentAlvaro Herrera2019-04-25
| | | | | | | | | | | | | | When an existing index in a partition is attached to a new index on its parent, we forgot to set the "relispartition" flag correctly, which meant that it was not possible to find the index in various operations, such as adding a foreign key constraint that references that partitioned table. One of four places that was assigning the parent index was forgetting to do that, so fix by shifting responsibility of updating the flag to the routine that changes the parent. Author: Amit Langote, Álvaro Herrera Reported-by: Hubert "depesz" Lubaczewski Discussion: https://postgr.es/m/CA+HiwqHMsRtRYRWYTWavKJ8x14AFsv7bmAV46mYwnfD3vy8goQ@mail.gmail.com
* Make pg_dump emit ATTACH PARTITION instead of PARTITION OFAlvaro Herrera2019-04-24
| | | | | | | | | | | | | | | | | | | | | | | Using PARTITION OF can result in column ordering being changed from the database being dumped, if the partition uses a column layout different from the parent's. It's not pg_dump's job to editorialize on table definitions, so this is not acceptable; back-patch all the way back to pg10, where partitioned tables where introduced. This change also ensures that partitions end up in the correct tablespace, if different from the parent's; this is an oversight in ca4103025dfe (in pg12 only). Partitioned indexes (in pg11) don't have this problem, because they're already created as independent indexes and attached to their parents afterwards. This change also has the advantage that the partition is restorable from the dump (as a standalone table) even if its parent table isn't restored. Author: David Rowley Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com Discussion: https://postgr.es/m/20190423185007.GA27954@alvherre.pgsql
* Fix some minor postmaster-state-machine issues.Tom Lane2019-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In sigusr1_handler, don't ignore PMSIGNAL_ADVANCE_STATE_MACHINE based on pmState. The restriction is unnecessary (PostmasterStateMachine should work in any state), not future-proof (since it makes too many assumptions about why the signal might be sent), and broken even today because a race condition can make it necessary to respond to the signal in PM_WAIT_READONLY state. The race condition seems unlikely, but if it did happen, a hot-standby postmaster could fail to shut down after receiving a smart-shutdown request. In MaybeStartWalReceiver, don't clear the WalReceiverRequested flag if the fork attempt fails. Leaving it set allows us to try again in future iterations of the postmaster idle loop. (The startup process would eventually send a fresh request signal, but this change may allow us to retry the fork sooner.) Remove an obsolete comment and unnecessary test in PostmasterStateMachine's handling of PM_SHUTDOWN_2 state. It's not possible to have a live walreceiver in that state, and AFAICT has not been possible since commit 5e85315ea. This isn't a live bug, but the false comment is quite confusing to readers. In passing, rearrange sigusr1_handler's CheckPromoteSignal tests so that we don't uselessly perform stat() calls that we're going to ignore the results of. Add some comments clarifying the behavior of MaybeStartWalReceiver; I very nearly rearranged it in a way that'd reintroduce the race condition fixed in e5d494d78. Mea culpa for not commenting that properly at the time. Back-patch to all supported branches. The PMSIGNAL_ADVANCE_STATE_MACHINE change is the only one of even minor significance, but we might as well keep this code in sync across branches. Discussion: https://postgr.es/m/9001.1556046681@sss.pgh.pa.us
* Repair assorted issues in locale data extraction.Tom Lane2019-04-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cache_locale_time (extraction of LC_TIME-related info) had never been taught the lessons we previously learned about extraction of info related to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught PGLC_localeconv() that data coming out of localeconv() was in an encoding determined by the relevant locale, but we didn't realize that there's a similar issue with strftime(). And commit a4930e7ca hardened PGLC_localeconv() against errors occurring partway through, but failed to do likewise for cache_locale_time(). So, rearrange the latter function to perform encoding conversion and not risk failure while it's got the locales set to temporary values. This time around I also changed PGLC_localeconv() to treat it as FATAL if it can't restore the previous settings of the locale values. There is no reason (except possibly OOM) for that to fail, and proceeding with the wrong locale values seems like a seriously bad idea --- especially on Windows where we have to also temporarily change LC_CTYPE. Also, protect against the possibility that we can't identify the codeset reported for LC_MONETARY or LC_NUMERIC; rather than just failing, try to validate the data without conversion. The user-visible symptom this fixes is that if LC_TIME is set to a locale name that implies an encoding different from the database encoding, non-ASCII localized day and month names would be retrieved in the wrong encoding, leading to either unexpected encoding-conversion error reports or wrong output from to_char(). The other possible failure modes are unlikely enough that we've not seen reports of them, AFAIK. The encoding conversion problems do not manifest on Windows, since we'd already created special-case code to handle that issue there. Per report from Juan José Santamaría Flecha. Back-patch to all supported versions. Juan José Santamaría Flecha and Tom Lane Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com
* Fix detection of passwords hashed with MD5 or SCRAM-SHA-256Michael Paquier2019-04-23
| | | | | | | | | | | | | | | | | | | | | | This commit fixes a couple of issues related to the way password verifiers hashed with MD5 or SCRAM-SHA-256 are detected, leading to being able to store in catalogs passwords which do not follow the supported hash formats: - A MD5-hashed entry was checked based on if its header uses "md5" and if the string length matches what is expected. Unfortunately the code never checked if the hash only used hexadecimal characters, as reported by Tom Lane. - A SCRAM-hashed entry was checked based on only its header, which should be "SCRAM-SHA-256$", but it never checked for any fields afterwards, as reported by Jonathan Katz. Backpatch down to v10, which is where SCRAM has been introduced, and where password verifiers in plain format have been removed. Author: Jonathan Katz Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/016deb6b-1f0a-8e9f-1833-a8675b170aa9@postgresql.org Backpatch-through: 10
* Fix problems with auto-held portals.Tom Lane2019-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | HoldPinnedPortals() did things in the wrong order: it must not mark a portal autoHeld until it's been successfully held. Otherwise, a failure while persisting the portal results in a server crash because we think the portal is in a good state when it's not. Also add a check that portal->status is READY before attempting to hold a pinned portal. We have such a check before the only other use of HoldPortal(), so it seems unwise not to check it here. Lastly, rethink the responsibility for where to call HoldPinnedPortals. The comment for it imagined that it was optional for any individual PL to call it or not, but that cannot be the case: if some outer level of procedure has a pinned portal, failing to persist it when an inner procedure commits is going to be trouble. Let's have SPI do it instead of the individual PLs. That's not a complete solution, since in theory a PL might not be using SPI to perform commit/rollback, but such a PL is going to have to be aware of lots of related requirements anyway. (This change doesn't cause an API break for any external PLs that might be calling HoldPinnedPortals per the old regime, because calling it twice during a commit or rollback sequence won't hurt.) Per bug #15703 from Julian Schauder. Back-patch to v11 where this code came in. Discussion: https://postgr.es/m/15703-c12c5bc0ea34ba26@postgresql.org
* Fix handling of temp and unlogged tables in FOR ALL TABLES publicationsPeter Eisentraut2019-04-18
| | | | | | | | | | | If a FOR ALL TABLES publication exists, temporary and unlogged tables are ignored for publishing changes. But CheckCmdReplicaIdentity() would still check in that case that such a table has a replica identity set before accepting updates. To fix, have GetRelationPublicationActions() return that such a table publishes no actions. Discussion: https://www.postgresql.org/message-id/f3f151f7-c4dd-1646-b998-f60bd6217dd3@2ndquadrant.com
* postgresql.conf.sample: add proper defaults for include actionsBruce Momjian2019-04-17
| | | | | | | | | | | | | Previously, include actions include_dir, include_if_exists, and include listed commented-out values which were not the defaults, which is inconsistent with other entries. Instead, replace them with '', which is the default value. Reported-by: Emanuel Araújo Discussion: https://postgr.es/m/CAMuTAkYMx6Q27wpELDR3_v9aG443y7ZjeXu15_+1nGUjhMWOJA@mail.gmail.com Backpatch-through: 9.4
* Fix unportable code in pgbench.Tom Lane2019-04-17
| | | | | | | | | The buildfarm points out that UINT64_FORMAT might not work with sscanf; it's calibrated for our printf implementation, which might not agree with the platform-supplied sscanf. Fall back to just accepting an unsigned long, which is already more than the documentation promises. Oversight in e6c3ba7fb; back-patch to v11, as that was.
* Don't write to stdin of a test process that could have already exited.Noah Misch2019-04-15
| | | | | | | Instead, close that stdin. Per buildfarm member conchuela. Back-patch to 9.6, where the test was introduced. Discussion: https://postgr.es/m/26478.1555373328@sss.pgh.pa.us
* Fix division by zero in _bt_vacuum_needs_cleanup()Alexander Korotkov2019-04-15
| | | | | | | | | | | Checks inside _bt_vacuum_needs_cleanup() allow division by zero to happen when metad->btm_last_cleanup_num_heap_tuples == 0. This commit adjusts the expression so that no division by zero might happen. Reported-by: Piotr Stefaniak Discussion: https://postgr.es/m/DB8PR03MB5931C41F7787A95313F08322F22A0%40DB8PR03MB5931.eurprd03.prod.outlook.com Reviewed-by: Masahiko Sawada Backpatch-through: 11
* Fix SHOW ALL command for non-superusers with replication connectionMichael Paquier2019-04-15
| | | | | | | | | | | | | | | | | | | | | | Since Postgres 10, SHOW commands can be triggered with replication connections in a WAL sender context, however it missed that a transaction context is needed for syscache lookups. This commit makes sure that the syscache lookups can happen correctly by setting a transaction context when running SHOW commands in a WAL sender. Superuser-only parameters can be displayed using SHOW commands not only to superusers, but also to members of system role pg_read_all_settings, which requires a syscache lookup to check if the connected role is a member of this system role or not, or the instance crashes. Superusers do not need to check the syscache so it worked correctly in this case. New tests are added to cover this issue. Reported-by: Alexander Kukushkin Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/15734-2daa8761eeed8e20@postgresql.org Backpatch-through: 10
* Test both 0.0.0.0 and 127.0.0.x addresses to find a usable port.Noah Misch2019-04-14
| | | | | | | | | | | | | Commit c098509927f9a49ebceb301a2cb6a477ecd4ac3c changed PostgresNode::get_new_node() to probe 0.0.0.0 instead of 127.0.0.1, but the new test was less effective for Windows native Perl. This increased the failure rate of buildfarm members bowerbird and jacana. Instead, test 0.0.0.0 and concrete addresses. This restores the old level of defense, but the algorithm is still subject to its longstanding time of check to time of use race condition. Back-patch to 9.6, like the previous change. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se
* MSYS: Translate REGRESS_SHLIB to a Windows file name.Noah Misch2019-04-14
| | | | | | | Per buildfarm member jacana. Back-patch to v11; earlier branches skip the affected test under msys. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se
* When Perl "kill(9, ...)" fails, try "pg_ctl kill".Noah Misch2019-04-13
| | | | | | | Per buildfarm member jacana, the former fails under msys Perl 5.8.8. Back-patch to 9.6, like the code in question. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se
* Prevent memory leaks associated with relcache rd_partcheck structures.Tom Lane2019-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | The original coding of generate_partition_qual() just copied the list of predicate expressions into the global CacheMemoryContext, making it effectively impossible to clean up when the owning relcache entry is destroyed --- the relevant code in RelationDestroyRelation() only managed to free the topmost List header :-(. This resulted in a session-lifespan memory leak whenever a table partition's relcache entry is rebuilt. Fortunately, that's not normally a large data structure, and rebuilds shouldn't occur all that often in production situations; but this is still a bug worth fixing back to v10 where the code was introduced. To fix, put the cached expression tree into its own small memory context, as we do with other complicated substructures of relcache entries. Also, deal more honestly with the case that a partition has an empty partcheck list; while that probably isn't a case that's very interesting for production use, it's legal. In passing, clarify comments about how partitioning-related relcache data structures are managed, and add some Asserts that we're not leaking old copies when we overwrite these data fields. Amit Langote and Tom Lane Discussion: https://postgr.es/m/7961.1552498252@sss.pgh.pa.us
* Consistently test for in-use shared memory.Noah Misch2019-04-12
| | | | | | | | | | | | | | | | | | | | | | postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer stop if shmat() of an old segment fails with EACCES. A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com
* Fix off-by-one check that can lead to a memory overflow in ecpg.Michael Meskes2019-04-11
| | | | Patch by Liu Huailing <liuhuailing@cn.fujitsu.com>
* Fix backwards test in operator_precedence_warning logic.Tom Lane2019-04-10
| | | | | | | | | | Warnings about unary minus might have been wrong. It's a bit surprising that nobody noticed yet ... probably the precedence-warning feature hasn't really been used much in the field. Rikard Falkeborn Discussion: https://postgr.es/m/CADRDgG6fzA8A2oeygUw4=o7ywo4kvz26NxCSgpq22nMD73Bx4Q@mail.gmail.com
* Avoid counting transaction stats for parallel worker cooperatingAmit Kapila2019-04-10
| | | | | | | | | | | | | | | | | | | | | | transaction. The transaction that is initiated by the parallel worker to cooperate with the actual transaction started by the main backend to complete the query execution should not be counted as a separate transaction. The other internal transactions started and committed by the parallel worker are still counted as separate transactions as we that is what we do in other places like autovacuum. This will partially fix the bloat in transaction stats due to additional transactions performed by parallel workers. For a complete fix, we need to decide how we want to show all the transactions that are started internally for various operations and that is a matter of separate patch. Reported-by: Haribabu Kommi Author: Haribabu Kommi Reviewed-by: Amit Kapila, Jamison Kirk and Rahila Syed Backpatch-through: 9.6 Discussion: https://postgr.es/m/CAJrrPGc9=jKXuScvNyQ+VNhO0FZk7LLAShAJRyZjnedd2D61EQ@mail.gmail.com
* Define WIN32_STACK_RLIMIT throughout win32 and cygwin builds.Noah Misch2019-04-09
| | | | | | | | The MSVC build system already did this, and commit 617dc6d299c957e2784320382b3277ede01d9c63 used it in a second file. Back-patch to 9.4, like that commit. Discussion: https://postgr.es/m/CAA8=A7_1SWc3+3Z=-utQrQFOtrj_DeohRVt7diA2tZozxsyUOQ@mail.gmail.com
* Avoid "could not reattach" by providing space for concurrent allocation.Noah Misch2019-04-08
| | | | | | | | | | | | | | | | | We've long had reports of intermittent "could not reattach to shared memory" errors on Windows. Buildfarm member dory fails that way when PGSharedMemoryReAttach() execution overlaps with creation of a thread for the process's "default thread pool". Fix that by providing a second region to receive asynchronous allocations that would otherwise intrude into UsedShmemSegAddr. In pgwin32_ReserveSharedMemoryRegion(), stop trying to free reservations landing at incorrect addresses; the caller's next step has been to terminate the affected process. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. He also did much of the prerequisite research; see commit bcbf2346d69f6006f126044864dd9383d50d87b4. Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com
* Fix improper interaction of FULL JOINs with lateral references.Tom Lane2019-04-08
| | | | | | | | | | | | | | | | | | | | | join_is_legal() needs to reject forming certain outer joins in cases where that would lead the planner down a blind alley. However, it mistakenly supposed that the way to handle full joins was to treat them as applying the same constraints as for left joins, only to both sides. That doesn't work, as shown in bug #15741 from Anthony Skorski: given a lateral reference out of a join that's fully enclosed by a full join, the code would fail to believe that any join ordering is legal, resulting in errors like "failed to build any N-way joins". However, we don't really need to consider full joins at all for this purpose, because we effectively force them to be evaluated in syntactic order, and that order is always legal for lateral references. Hence, get rid of this broken logic for full joins and just ignore them instead. This seems to have been an oversight in commit 7e19db0c0. Back-patch to all supported branches, as that was. Discussion: https://postgr.es/m/15741-276f1f464b3f40eb@postgresql.org
* Fix EvalPlanQualStart to handle partitioned result rels correctly.Tom Lane2019-04-08
| | | | | | | | | | | | The es_root_result_relations array needs to be shallow-copied in the same way as the main es_result_relations array, else EPQ rechecks on partitioned result relations fail, as seen in bug #15677 from Norbert Benkocs. Amit Langote, isolation test case added by me Discussion: https://postgr.es/m/15677-0bf089579b4cd02d@postgresql.org Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us
* Fix partition tuple routing with dropped attributesMichael Paquier2019-04-08
| | | | | | | | | | | | | | | | | | | | | | When trying to insert a tuple into a partitioned table, the routing to the correct partition has been messed up by mixing when a tuple needs to be stored in an intermediate parent's slot and when a tuple needs to be converted because of attribute changes between the immediate parent relation and the parent relation one level above that (the grandparent). This could trigger errors like the following: ERROR: cannot extract attribute from empty tuple slot SQL state: XX000 This was not detected because regression tests with dropped attributes only included tests with two levels of partitioning, and this can be triggered with three levels or more. This fixes bug #15733, which has been introduced by 34295b8. The bug happens only on REL_11_STABLE and HEAD gains the regression tests added for this bug. Reported-by: Petr Fedorov Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15733-7692379e310b80ec@postgresql.org
* Avoid fetching past the end of the indoption array.Tom Lane2019-04-07
| | | | | | | | | | | pg_get_indexdef_worker carelessly fetched indoption entries even for non-key index columns that don't have one. 99.999% of the time this would be harmless, since the code wouldn't examine the value ... but some fine day this will be a fetch off the end of memory, resulting in SIGSEGV. Detected through valgrind testing. Odd that the buildfarm's valgrind critters haven't noticed.
* Clean up side-effects of commits ab5fcf2b0 et al.Tom Lane2019-04-07
| | | | | | | | | | | | | | | | | | | | | | | Before those commits, partitioning-related code in the executor could assume that ModifyTableState.resultRelInfo[] contains only leaf partitions. However, now a fully-pruned update results in a dummy ModifyTable that references the root partitioned table, and that breaks some stuff. In v11, this led to an assertion or core dump in the tuple routing code. Fix by disabling tuple routing, since we don't need that anyway. (I chose to do that in HEAD as well for safety, even though the problem doesn't manifest in HEAD as it stands.) In v10, this confused ExecInitModifyTable's decision about whether it needed to close the root table. But we can get rid of that altogether by being smarter about where to find the root table. Note that since the referenced commits haven't shipped yet, this isn't fixing any bug the field has seen. Amit Langote, per a report from me Discussion: https://postgr.es/m/20710.1554582479@sss.pgh.pa.us
* Fix failures in validateForeignKeyConstraint's slow path.Tom Lane2019-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | The foreign-key-checking loop in ATRewriteTables failed to ignore relations without storage (e.g., partitioned tables), unlike the initial loop. This accidentally worked as long as RI_Initial_Check succeeded, which it does in most practical cases (including all the ones exercised in the existing regression tests :-(). However, if that failed, as for instance when there are permissions issues, then we entered the slow fire-the-trigger-on-each-tuple path. And that would try to read from the referencing relation, and fail if it lacks storage. A second problem, recently introduced in HEAD, was that this loop had been broken by sloppy refactoring for the tableam API changes. Repair both issues, and add a regression test case so we have some coverage on this code path. Back-patch as needed to v11. (It looks like this code could do with additional bulletproofing, but let's get a working test case in place first.) Hadi Moshayedi, Tom Lane, Andres Freund Discussion: https://postgr.es/m/CAK=1=WrnNmBbe5D9sm3t0a6dnAq3cdbF1vXY816j1wsMqzC8bw@mail.gmail.com Discussion: https://postgr.es/m/19030.1554574075@sss.pgh.pa.us Discussion: https://postgr.es/m/20190325180405.jytoehuzkeozggxx%40alap3.anarazel.de
* Revert "Consistently test for in-use shared memory."Noah Misch2019-04-05
| | | | | | | | | This reverts commits 2f932f71d9f2963bbd201129d7b971c8f5f077fd, 16ee6eaf80a40007a138b60bb5661660058d0422 and 6f0e190056fe441f7cf788ff19b62b13c94f68f3. The buildfarm has revealed several bugs. Back-patch like the original commits. Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com
* Silence -Wimplicit-fallthrough in sysv_shmem.c.Noah Misch2019-04-03
| | | | | | | | | | Commit 2f932f71d9f2963bbd201129d7b971c8f5f077fd added code that elicits a warning on buildfarm member flaviventris. Back-patch to 9.4, like that commit. Reported by Andres Freund. Discussion: https://postgr.es/m/20190404020057.galelv7by75ekqrh@alap3.anarazel.de
* Make src/test/recovery/t/017_shm.pl safe for concurrent execution.Noah Misch2019-04-03
| | | | | | | | Buildfarm members idiacanthus and komodoensis, which share a host, both executed this test in the same second. That failed. Back-patch to 9.6, where the test first appeared. Discussion: https://postgr.es/m/20190404020543.GA1319573@rfd.leadboat.com
* Handle USE_MODULE_DB for all tests able to use an installed postmaster.Noah Misch2019-04-03
| | | | | | | | | | | | | | | | | | | | When $(MODULES) and $(MODULE_big) are empty, derive the database name from the first element of $(REGRESS) instead of using a constant string. When deriving the database name from $(MODULES), use its first element instead of the entire list; the earlier approach would fail if any multi-module directory had $(REGRESS) tests. Treat isolation suites and src/pl correspondingly. Under USE_MODULE_DB=1, installcheck-world and check-world no longer reuse any database name in a given postmaster. Buildfarm members axolotl, mandrill and frogfish saw spurious "is being accessed by other users" failures that would not have happened without database name reuse. (The CountOtherDBBackends() 5s deadline expired during DROP DATABASE; a backend for an earlier test suite had used the same database name and had not yet exited.) Back-patch to 9.4 (all supported versions), except bits pertaining to isolation suites. Concept reviewed by Andrew Dunstan, Andres Freund and Tom Lane. Discussion: https://postgr.es/m/20190401135213.GE891537@rfd.leadboat.com
* Consistently test for in-use shared memory.Noah Misch2019-04-03
| | | | | | | | | | | | | | | | | | | | | postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
* Perform RLS subquery checks as the right user when going via a view.Dean Rasheed2019-04-02
| | | | | | | | | | | | | | When accessing a table with RLS via a view, the RLS checks are performed as the view owner. However, the code neglected to propagate that to any subqueries in the RLS checks. Fix that by calling setRuleCheckAsUser() for all RLS policy quals and withCheckOption checks for RTEs with RLS. Back-patch to 9.5 where RLS was added. Per bug #15708 from daurnimator. Discussion: https://postgr.es/m/15708-d65cab2ce9b1717a@postgresql.org
* Update HINT for pre-existing shared memory block.Noah Misch2019-03-31
| | | | | | | | | | | | One should almost always terminate an old process, not use a manual removal tool like ipcrm. Removal of the ipcclean script eleven years ago (39627b1ae680cba44f6e56ca5facec4fdbfe9495) and its non-replacement corroborate that manual shm removal is now a niche goal. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20180812064815.GB2301738@rfd.leadboat.com
* Have pg_upgrade's Makefile honor NO_TEMP_INSTALLAndrew Dunstan2019-03-31
| | | | | | Backpatch to 9.5, when pg_upgrade's location changed. Discussion: https://postgr.es/m/5506b8fa-7dad-8483-053c-7ca7ef04f01a@2ndQuadrant.com
* Avoid crash in partitionwise join planning under GEQO.Tom Lane2019-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While trying to plan a partitionwise join, we may be faced with cases where one or both input partitions for a particular segment of the join have been pruned away. In HEAD and v11, this is problematic because earlier processing didn't bother to make a pruned RelOptInfo fully valid. With an upcoming patch to make partition pruning more efficient, this'll be even more problematic because said RelOptInfo won't exist at all. The existing code attempts to deal with this by retroactively making the RelOptInfo fully valid, but that causes crashes under GEQO because join planning is done in a short-lived memory context. In v11 we could probably have fixed this by switching to the planner's main context while fixing up the RelOptInfo, but that idea doesn't scale well to the upcoming patch. It would be better not to mess with the base-relation data structures during join planning, anyway --- that's just a recipe for order-of-operations bugs. In many cases, though, we don't actually need the child RelOptInfo, because if the input is certainly empty then the join segment's result is certainly empty, so we can skip making a join plan altogether. (The existing code ultimately arrives at the same conclusion, but only after doing a lot more work.) This approach works except when the pruned-away partition is on the nullable side of a LEFT, ANTI, or FULL join, and the other side isn't pruned. But in those cases the existing code leaves a lot to be desired anyway --- the correct output is just the result of the unpruned side of the join, but we were emitting a useless outer join against a dummy Result. Pending somebody writing code to handle that more nicely, let's just abandon the partitionwise-join optimization in such cases. When the modified code skips making a join plan, it doesn't make a join RelOptInfo either; this requires some upper-level code to cope with nulls in part_rels[] arrays. We would have had to have that anyway after the upcoming patch. Back-patch to v11 since the crash is demonstrable there. Discussion: https://postgr.es/m/8305.1553884377@sss.pgh.pa.us
* Fix off-by-one error in txid_status().Thomas Munro2019-03-27
| | | | | | | | | | | | | The transaction ID returned by GetNextXidAndEpoch() is in the future, so we can't attempt to access its status or we might try to read a CLOG page that doesn't exist. The > vs >= confusion probably stemmed from the choice of a variable name containing the word "last" instead of "next", so fix that too. Back-patch to 10 where the function arrived. Author: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKG%2Buua_BV5cyfsioKVN2d61Lukg28ECsWTXKvh%3DBtN2DPA%40mail.gmail.com
* Track unowned relations in doubly-linked listTomas Vondra2019-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | Relations dropped in a single transaction are tracked in a list of unowned relations. With large number of dropped relations this resulted in poor performance at the end of a transaction, when the relations are removed from the singly linked list one by one. Commit b4166911 attempted to address this issue (particularly when it happens during recovery) by removing the relations in a reverse order, resulting in O(1) lookups in the list of unowned relations. This did not work reliably, though, and it was possible to trigger the O(N^2) behavior in various ways. Instead of trying to remove the relations in a specific order with respect to the linked list, which seems rather fragile, switch to a regular doubly linked. That allows us to remove relations cheaply no matter where in the list they are. As b4166911 was a bugfix, backpatched to all supported versions, do the same thing here. Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com Backpatch-through: 9.4
* Fix partitioned index creation bug with dropped columnsAlvaro Herrera2019-03-26
| | | | | | | | | | | | | | | | | | | ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the index is defined contains more dropped columns than its partition, with this message: ERROR: incorrect attribute map The cause was that one caller of CompareIndexInfo was passing the number of attributes of the partition rather than the parent, which confused the length check. Repair. This can cause pg_upgrade to fail when used on such a database. Leave some more objects around after regression tests, so that the case is detected by pg_upgrade test suite. Remove some spurious empty lines noticed while looking for other cases of the same problem. Discussion: https://postgr.es/m/20190326213924.GA2322@alvherre.pgsql
* psql: Schema-qualify typecast in one \d queryAlvaro Herrera2019-03-26
| | | | Bug introduced in my commit bc87f22ef6ef
* Fix WAL format incompatibility introduced by backpatching of 52ac6cd2d0Alexander Korotkov2019-03-24
| | | | | | | | | | | | | | | | | | | 52ac6cd2d0 added new field to ginxlogDeletePage and was backpatched to 9.4. That led to problems when patched postgres instance applies WAL records generated by non-patched one. WAL records generated by non-patched instance don't contain new field, which patched one is expecting to see. Thankfully, we can distinguish patched and non-patched WAL records by their data size. If we see that WAL record is generated by non-patched instance, we skip processing of new field. This commit comes with some assertions. In particular, if it appears that on some platform struct data size didn't change then static assertion will trigger. Reported-by: Simon Riggs Discussion: https://postgr.es/m/CANP8%2Bj%2BK4whxf7ET7%2BgO%2BG-baC3-WxqqH%3DnV4X2CgfEPA3Yu3g%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Simon Riggs, Alvaro Herrera Backpatch-through: 9.4
* Make current_logfiles use permissions assigned to files in data directoryMichael Paquier2019-03-24
| | | | | | | | | | | | | | | | | | | | | | | Since its introduction in 19dc233c, current_logfiles has been assigned the same permissions as a log file, which can be enforced with log_file_mode. This setup can lead to incompatibility problems with group access permissions as current_logfiles is not located in the log directory, but at the root of the data folder. Hence, if group permissions are used but log_file_mode is more restrictive, a backup with a user in the group having read access could fail even if the log directory is located outside of the data folder. Per discussion with the folks mentioned below, we have concluded that current_logfiles should not be treated as a log file as it only stores metadata related to log files, and that it should use the same permissions as all other files in the data directory. This solution has the merit to be simple and fixes all the interaction problems between group access and log_file_mode. Author: Haribabu Kommi Reviewed-by: Stephen Frost, Robert Haas, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CAJrrPGcEotF1P7AWoeQyD3Pqr-0xkQg_Herv98DjbaMj+naozw@mail.gmail.com Backpatch-through: 11, where group access has been added.
* Remove inadequate check for duplicate "xml" PI.Tom Lane2019-03-23
| | | | | | I failed to think about PIs starting with "xml". We don't really need this check at all, so just take it out. Oversight in commit 8d1dadb25 et al.
* Ensure xmloption = content while restoring pg_dump output.Tom Lane2019-03-23
| | | | | | | | In combination with the previous commit, this ensures that valid XML data can always be dumped and reloaded, whether it is "document" or "content". Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
* Accept XML documents when xmloption = content, as required by SQL:2006+.Tom Lane2019-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we were using the SQL:2003 definition, which doesn't allow this, but that creates a serious dump/restore gotcha: there is no setting of xmloption that will allow all valid XML data. Hence, switch to the 2006 definition. Since libxml doesn't accept <!DOCTYPE> directives in the mode we use for CONTENT parsing, the implementation is to detect <!DOCTYPE> in the input and switch to DOCUMENT parsing mode. This should not cost much, because <!DOCTYPE> should be close to the front of the input if it's there at all. It's possible that this causes the error messages for malformed input to be slightly different than they were before, if said input includes <!DOCTYPE>; but that does not seem like a big problem. In passing, buy back a few cycles in parsing of large XML documents by not doing strlen() of the whole input in parse_xml_decl(). Back-patch because dump/restore failures are not nice. This change shouldn't break any cases that worked before, so it seems safe to back-patch. Chapman Flack (revised a bit by me) Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
* Restore RI trigger sanity checkAlvaro Herrera2019-03-20
| | | | | | | | I unnecessarily removed this check in 3de241dba86f because I misunderstood what the final representation of constraints across a partitioning hierarchy was to be. Put it back (in both branches). Discussion: https://postgr.es/m/201901222145.t6wws6t6vrcu@alvherre.pgsql
* Hack back-branch SSL tests to avoid intermittent buildfarm failures.Tom Lane2019-03-19
| | | | | | | | | | | | | | | | Buildfarm member eelpout sometimes reports the wrong error message for an SSL connection failure. In HEAD, this problem is believed to be solved by commit 1f39a1c06, but I'm as yet unwilling to back-patch that. The problem seems fairly unlikely to be an issue in the field, since (as far as we can tell) it happens only during a failure of a local-loopback SSL connection, and it's improbable even then. It seems better to just live with it for the time being; but let's tweak the regression test to accept the other error message as a "pass". Needed in v11 only, since older branches didn't check the message text anyway. Discussion: https://postgr.es/m/CAEepm=2n6Nv+5tFfe8YnkUm1fXgvxR0Mm1FoD+QKG-vLNGLyKg@mail.gmail.com