aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Fix assignment to array of domain over composite.Tom Lane2021-10-19
| | | | | | | | | | | | | | | An update such as "UPDATE ... SET fld[n].subfld = whatever" failed if the array elements were domains rather than plain composites. That's because isAssignmentIndirectionExpr() failed to cope with the CoerceToDomain node that would appear in the expression tree in this case. The result would typically be a crash, and even if we accidentally didn't crash, we'd not correctly preserve other fields of the same array element. Per report from Onder Kalaci. Back-patch to v11 where arrays of domains came in. Discussion: https://postgr.es/m/PH0PR21MB132823A46AA36F0685B7A29AD8BD9@PH0PR21MB1328.namprd21.prod.outlook.com
* Remove bogus assertion in transformExpressionList().Tom Lane2021-10-19
| | | | | | | | | | | | | | | | | I think when I added this assertion (in commit 8f889b108), I was only thinking of the use of transformExpressionList at top level of INSERT and VALUES. But it's also called by transformRowExpr(), which can certainly occur in an UPDATE targetlist, so it's inappropriate to suppose that p_multiassign_exprs must be empty. Besides, since the input is not expected to contain ResTargets, there's no reason it should contain MultiAssignRefs either. Hence this code need not be concerned about the state of p_multiassign_exprs, and we should just drop the assertion. Per bug #17236 from ocean_li_996. It's been wrong for years, so back-patch to all supported branches. Discussion: https://postgr.es/m/17236-3210de9bcba1d7ca@postgresql.org
* Fix bug in TOC file error message printingDaniel Gustafsson2021-10-19
| | | | | | | | | | | | | | | | | | | | | | | | If the blob TOC file cannot be parsed, the error message was failing to print the filename as the variable holding it was shadowed by the destination buffer for parsing. When the filename fails to parse, the error will print an empty string: ./pg_restore -d foo -F d dump pg_restore: error: invalid line in large object TOC file "": .. ..instead of the intended error message: ./pg_restore -d foo -F d dump pg_restore: error: invalid line in large object TOC file "dump/blobs.toc": .. Fix by renaming both variables as the shared name was too generic to store either and still convey what the variable held. Backpatch all the way down to 9.6. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/A2B151F5-B32B-4F2C-BA4A-6870856D9BDE@yesql.se Backpatch-through: 9.6
* Fix sscanf limits in pg_basebackup and pg_dumpDaniel Gustafsson2021-10-19
| | | | | | | | | | | | | | | | | | | | | Make sure that the string parsing is limited by the size of the destination buffer. In pg_basebackup the available values sent from the server is limited to two characters so there was no risk of overflow. In pg_dump the buffer is bounded by MAXPGPATH, and thus the limit must be inserted via preprocessor expansion and the buffer increased by one to account for the terminator. There is no risk of overflow here, since in this case, the buffer scanned is smaller than the destination buffer. Backpatch the pg_basebackup fix to 11 where it was introduced, and the pg_dump fix all the way down to 9.6. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/B14D3D7B-F98C-4E20-9459-C122C67647FB@yesql.se Backpatch-through: 11 and 9.6
* Block ALTER INDEX/TABLE index_name ALTER COLUMN colname SET (options)Michael Paquier2021-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The grammar of this command run on indexes with column names has always been authorized by the parser, and it has never been documented. Since 911e702, it is possible to define opclass parameters as of CREATE INDEX, which actually broke the old case of ALTER INDEX/TABLE where relation-level parameters n_distinct and n_distinct_inherited could be defined for an index (see 76a47c0 and its thread where this point has been touched, still remained unused). Attempting to do that in v13~ would cause the index to become unusable, as there is a new dedicated code path to load opclass parameters instead of the relation-level ones previously available. Note that it is possible to fix things with a manual catalog update to bring the relation back online. This commit disables this command for now as the use of column names for indexes does not make sense anyway, particularly when it comes to index expressions where names are automatically computed. One way to properly support this case properly in the future would be to use column numbers when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN .. SET STATISTICS. Partitioned indexes were already blocked, but not indexes. Some tests are added for both cases. There was some code in ANALYZE to enforce n_distinct to be used for an index expression if the parameter was defined, but just remove it for now until/if there is support for this (note that index-level parameters never had support in pg_dump either, previously), so this was just dead code. Reported-by: Matthijs van der Vleuten Author: Nathan Bossart, Michael Paquier Reviewed-by: Vik Fearing, Dilip Kumar Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org Backpatch-through: 13
* Invalidate partitions of table being attached/detachedAlvaro Herrera2021-10-18
| | | | | | | | | | | | | | | | Failing to do that, any direct inserts/updates of those partitions would fail to enforce the correct constraint, that is, one that considers the new partition constraint of their parent table. Backpatch to 10. Reported by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
* Reset properly snapshot export state during transaction abortMichael Paquier2021-10-18
| | | | | | | | | | | | | | | | | | | | | | | | During a replication slot creation, an ERROR generated in the same transaction as the one creating a to-be-exported snapshot would have left the backend in an inconsistent state, as the associated static export snapshot state was not being reset on transaction abort, but only on the follow-up command received by the WAL sender that created this snapshot on replication slot creation. This would trigger inconsistency failures if this session tried to export again a snapshot, like during the creation of a replication slot. Note that a snapshot export cannot happen in a transaction block, so there is no need to worry resetting this state for subtransaction aborts. Also, this inconsistent state would very unlikely show up to users. For example, one case where this could happen is an out-of-memory error when building the initial snapshot to-be-exported. Dilip found this problem while poking at a different patch, that caused an error in this code path for reasons unrelated to HEAD. Author: Dilip Kumar Reviewed-by: Michael Paquier, Zhihong Yu Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com Backpatch-through: 9.6
* Avoid core dump in pg_dump when dumping from pre-8.3 server.Tom Lane2021-10-16
| | | | | | Commit f0e21f2f6 missed adding a tgisinternal output column to getTriggers' query for pre-8.3 servers. Back-patch to v11, like that commit.
* Make pg_dump acquire lock on partitioned tables that are to be dumped.Tom Lane2021-10-16
| | | | | | | | | | | | | | | | | | | | It was clearly the intent to do so all along, but the original coding fat-fingered this by checking the wrong array element. We fixed it in passing in 403a3d91c, but that later got reverted, and we forgot to keep this bug fix. Most of the time this'd be relatively harmless, since once we lock any of the partitioned table's leaf partitions, that would suffice to prevent major DDL on the partitioned table itself. However, a childless partitioned table would get dumped with no relevant lock whatsoever, possibly allowing dump failure or inconsistent output. Unlike 403a3d91c, there are no versioning concerns, since every server version that has partitioned tables will allow you to lock one. Back-patch to v10 where partitioned tables were introduced. Discussion: https://postgr.es/m/1018205.1634346327@sss.pgh.pa.us
* Check criticalSharedRelcachesBuilt in GetSharedSecurityLabel().Jeff Davis2021-10-14
| | | | | | | | | | An extension may want to call GetSecurityLabel() on a shared object before the shared relcaches are fully initialized. For instance, a ClientAuthentication_hook might want to retrieve the security label on a role. Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com Backpatch-through: 9.6
* Fix planner error with pulling up subquery expressions into function RTEs.Tom Lane2021-10-14
| | | | | | | | | | | | | | | | If a function-in-FROM laterally references the output of some sub-SELECT earlier in the FROM clause, and we are able to flatten that sub-SELECT into the outer query, the expression(s) copied into the function RTE missed being processed by eval_const_expressions. This'd lead to trouble and probable crashes at execution if such expressions contained named-argument function call syntax or functions with defaulted arguments. The bug is masked if the query contains any explicit JOIN syntax, which may help explain why we'd not noticed. Per bug #17227 from Bernd Dorn. This is an oversight in commit 7266d0997, so back-patch to v13 where that came in. Discussion: https://postgr.es/m/17227-5a28ed1512189fa4@postgresql.org
* Change recently added test code for stabilityAlvaro Herrera2021-10-13
| | | | | | | | | | | | | The test code added with ff9f111bce24 fails under valgrind, and probably other slow cases too, because if (say) autovacuum runs in between and produces WAL of its own, the large INSERT fails to account for that in the LSN calculations. Rewrite to use a DO loop. Per complaint from Andres Freund Backpatch to all branches. Discussion: https://postgr.es/m/20211013180338.5guyqzpkcisqugrl@alap3.anarazel.de
* Fix tests of pg_upgrade across different major versionsMichael Paquier2021-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a set of issues that cause different breakages or annoyances when using pg_upgrade's test.sh to do upgrades across different major versions: - test.sh is completely broken when using v14 as new version because of the removal of testtablespace/ as Makefile rule. Older versions of pg_regress don't support --make-tablespacedir, blocking the creation of the tablespace. In order to fix that, it is simple enough to create those directories in the script itself, but only do that when an old version is involved. This fix is needed on HEAD and REL_14_STABLE. - The script would fail when using PG <= v11 as old version because of WITH OIDS relations not supported in v12. In order to fix this, this steals a method from the buildfarm that uses a DO block to change all the relations marked as WITH OIDS, allowing pg_upgrade to pass. This is more portable than using ALTER TABLE queries on the relations causing issues. This is fixed down to v12, and authored originally by Andrew Dunstan. - Not using --extra-float-digits=0 with v11 as old version causes a lot of diffs in the dumps, making the whole unreadable. This gets only done when using v11 as old version. This is fixed down to v12. The buildfarm code uses that already. Note that the addition of --wal-segsize and --allow-group-access breaks the script when using v10 or older at initdb time as these got added in 11. 10 would be EOL'd next year and nobody has complained about those problems yet, so nothing is done about that. This means that this commit fixes upgrade tests using test.sh with v11 as minimum older version, up to HEAD, and that it is enough to apply this change down to 12. The old and new dumps still generate diffs, still require manual checks, and more could be done to reduce the noise, but this allows the tests to run with a rather minimal amount of them. I have tested this commit and test.sh with v11 as minimum across all the branches where this is applied. Note that this commit has no impact on the normal pg_upgrade test run with a simple "make check". Author: Justin Pryzby, Andrew Dunstan, Michael Paquier Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com Backpatch-through: 12
* Add more $Test::Builder::Level in the TAP testsMichael Paquier2021-10-12
| | | | | | | | | | | | | | | | | | | | | Incrementing the level of the call stack reported is useful for debugging purposes as it allows to control which part of the test is exactly failing, especially if a test is structured with subroutines that call routines from Test::More. This adds more incrementations of $Test::Builder::Level where debugging gets improved (for example it does not make sense for some paths like pg_rewind where long subroutines are used). A note is added to src/test/perl/README about that, based on a suggestion from Andrew Dunstan and a wording coming from both of us. Usage of Test::Builder::Level has spread in 12, so a backpatch down to this version is done. Reviewed-by: Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson Discussion: https://postgr.es/m/YV1CCFwgM1RV1LeS@paquier.xyz Backpatch-through: 12
* Add missing word to comment in joinrels.c.Etsuro Fujita2021-10-07
| | | | | | Author: Amit Langote Backpatch-through: 13 Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com
* Fix corner-case loss of precision in numeric_power().Dean Rasheed2021-10-06
| | | | | | | | | | | | | | | | | | | | | | | | This fixes a loss of precision that occurs when the first input is very close to 1, so that its logarithm is very small. Formerly, during the initial low-precision calculation to estimate the result weight, the logarithm was computed to a local rscale that was capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be as close as 1e-16383 to 1, hence its logarithm may be as small as 1e-16383, and so the local rscale needs to be allowed to exceed 16383, otherwise all precision is lost, leading to a poor choice of rscale for the full-precision calculation. Fix this by removing the cap on the local rscale during the initial low-precision calculation, as we already do in the full-precision calculation. This doesn't change the fact that the initial calculation is a low-precision approximation, computing the logarithm to around 8 significant digits, which is very fast, especially when the base is very close to 1. Patch by me, reviewed by Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com
* Fix warning in TAP test of pg_verifybackupMichael Paquier2021-10-06
| | | | | | | | Oversight in a3fcbcd. Reported-by: Thomas Munro Discussion: https://postgr.es/m/CA+hUKGKnajZEwe91OTjro9kQLCMGGFHh2vvFn8tgHgbyn4bF9w@mail.gmail.com Backpatch-through: 13
* Fix TestLib::slurp_file() with offset on windows.Andres Freund2021-10-04
| | | | | | | | | | | | | | 3c5b0685b921 used setFilePointer() to set the position of the filehandle, but passed the wrong filehandle, always leaving the position at 0. Instead of just fixing that, remove use of setFilePointer(), we have a perl fd at this point, so we can just use perl's seek(). Additionally, the perl filehandle wasn't closed, just the windows filehandle. Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20211003173038.64mmhgxctfqn7wl6@alap3.anarazel.de Backpatch: 9.6-, like 3c5b0685b921
* Update our mapping of Windows time zone names some more.Tom Lane2021-10-04
| | | | | | | | | | | | | | | | | | | Per discussion, let's just follow CLDR's default zone mappings faithfully. There are two changes here that are clear improvements: * Mapping "Greenwich Standard Time" to Atlantic/Reykjavik is actually a better fit than using London, because Iceland hasn't observed DST since 1968, so this is more nearly what people might expect. * Since the "Samoa" zone is specified to be UTC+13:00, we must map it to Pacific/Apia not Pacific/Samoa; the latter refers to American Samoa which is now on the other side of the date line. The rest of these changes look like they're choosing the most populous IANA zone as representative. Whatever the details, we're just going to say "if you don't like this mapping, complain to CLDR". Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
* Fix snapshot builds during promotion of hot standby node with 2PCMichael Paquier2021-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some specific logic is done at the end of recovery when involving 2PC transactions: 1) Call RecoverPreparedTransactions(), to recover the state of 2PC transactions into memory (re-acquire locks, etc.). 2) ShutdownRecoveryTransactionEnvironment(), to move back to normal operations, mainly cleaning up recovery locks and KnownAssignedXids (including any 2PC transaction tracked previously). 3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is the tipping point for any process calling RecoveryInProgress() to check if the cluster is still in recovery or not. Any snapshot taken between steps 2) and 3) would be empty, causing any transaction relying on a snapshot at this point to potentially corrupt data as there could still be some 2PC transactions to track, with RecentXmin moving backwards on successive calls to GetSnapshotData() in the same transaction. As SharedRecoveryState is the point to take into account to know if it is safe to discard KnownAssignedXids, this commit moves step 2) after step 3), so as we can never finish with empty snapshots. This exists since the introduction of hot standby, so backpatch all the way down. The window with incorrect snapshots is extremely small, but I have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm member fairywren. Thomas Munro also found it independently. Special thanks to Andres Freund for taking the time to analyze this issue. Reported-by: Thomas Munro, Michael Paquier Analyzed-by: Andres Freund Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de Backpatch-through: 9.6
* Update our mapping of Windows time zone names using CLDR info.Tom Lane2021-10-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This corrects a bunch of entries in win32_tzmap[], and adds a few new ones, based on the CLDR project's windowsZones.xml file. Non-cosmetic changes fall into four main categories: * Flat-out errors: US/Aleutan doesn't exist America/Salvador doesn't exist Asia/Baku is wrong for Yerevan Asia/Dhaka (Bangladesh) is wrong for Astana (Kazakhstan) Europe/Bucharest is wrong for Chisinau America/Mexico_City is wrong for Chetumal America/Buenos_Aires is wrong for Cayenne America/Caracas has its own zone, so poor fit for La Paz US/Eastern is wrong for Haiti US/Eastern is wrong for Indiana (East) Asia/Karachi is wrong for Tashkent Etc/UTC+12 doesn't exist Signs of Etc/GMT zones were backwards * Judgment calls: (These changes follow CLDR's choices, except for the first one) Use Europe/London for "Greenwich Standard Time", since that seems much more likely than Africa/Casablanca to be what people will think that zone name means. CLDR has Atlantic/Reykjavik here, but that's no better. Asia/Shanghai seems a better fit than Hong Kong for "China Standard Time". Europe/Sarajevo is now a link to Belgrade, ie "Central Europe Standard Time"; so use Warsaw for "Central European Standard Time". America/Sao_Paulo seems more representative than Araguaina for "E. South America Standard Time". Africa/Johannesburg seems more representative than Harare for "South Africa Standard Time". * New Windows zone names: "Israel Standard Time" "Kaliningrad Standard Time" "Russia Time Zone N" for various N "Singapore Standard Time" "South Sudan Standard Time" "W. Central Africa Standard Time" "West Bank Standard Time" "Yukon Standard Time" Some of these replace older spellings, but I kept the older spellings too in case our code runs on a machine with the older data. * Replace aliases (tzdb Links) with underlying city-named zones: (This tracks tzdb's longstanding practice, and reduces inconsistency with the rest of the entries, as well as with CLDR.) US/Alaska Asia/Kuwait Asia/Muscat Canada/Atlantic Australia/Canberra Canada/Saskatchewan US/Central US/Eastern US/Hawaii US/Mountain Canada/Newfoundland US/Pacific Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
* Re-alphabetize the win32_tzmap[] array.Tom Lane2021-10-02
| | | | | | | | | | | | | The original intent seems to have been to sort case-insensitively by the Windows zone name, but various changes over the years did not get that memo. This commit just moves a few entries to restore exact alphabetic order, to ease comparison to the outputs of processing scripts. Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
* Error out if SKIP LOCKED and WITH TIES are both specifiedAlvaro Herrera2021-10-01
| | | | | | | | | | | | | | | Both bugs #16676[1] and #17141[2] illustrate that the combination of SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes to rows returned to other sessions accessing the same row. Since this situation is detectable from the syntax and hard to fix otherwise, forbid for now, with the potential to fix in the future. [1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org [2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org Backpatch-through: 13, where WITH TIES was introduced Author: David Christensen <david.christensen@crunchydata.com> Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
* Avoid believing incomplete MCV-only stats in get_variable_range().Tom Lane2021-10-01
| | | | | | | | | | | | | | | | | | | | | | get_variable_range() would incautiously believe that statistics containing only an MCV list are sufficient to derive a range estimate. That's okay for an enum-like column that contains only MCVs, but otherwise the estimate could be pretty bad. Make it report that the range is indeterminate unless the MCVs plus nullfrac account for the whole table. I don't think this needs a dedicated test case, since a quick code coverage check verifies that the existing regression tests traverse all the alternatives. There is room to doubt that a future-proof test case could be built anyway, given that the submitted example accidentally doesn't fail before v11. Per bug #17207 from Simon Perepelitsa. Back-patch to v10. In principle this has been broken all along, but I'm hesitant to make such changes in 9.6, since if anyone is unhappy with 9.6.24's behavior there will be no second chance to fix it. Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
* Fix Portal snapshot tracking to handle subtransactions properly.Tom Lane2021-10-01
| | | | | | | | | | | | | | | | | | | | | | | Commit 84f5c2908 forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing. To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack. Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it. Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit. Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
* Remove gratuitous environment dependency in 002_types.pl test.Tom Lane2021-09-30
| | | | | | | | | | | | | | | | | | Computing related timestamps by subtracting "N days" is sensitive to the prevailing timezone, since we interpret that as "same local time on the N'th prior day". Even though the intervals in question are only two to four days, through remarkable bad luck they managed to cross the end of Ramadan in 2014, causing the test's output to change if timezone is set to Africa/Casablanca. (Maybe in other Muslim areas as well; I didn't check.) There's absolutely no reason for this test to exercise interval subtraction, so just get rid of that and use plain timestamptz constants representing the intended values. Per report from Andres Freund. Back-patch to v10 where this test script came in. Discussion: https://postgr.es/m/20210930183641.7lh4jhvpipvromca@alap3.anarazel.de
* Fix WAL replay in presence of an incomplete recordAlvaro Herrera2021-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit 515e3d84a0b5, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
* pgbench: Fix handling of socket errors during benchmark.Fujii Masao2021-09-29
| | | | | | | | | | | | | | Previously socket errors such as invalid socket or socket wait method failures during benchmark caused pgbench to exit with status 0. Instead, errors during the run should result in exit status 2. Back-patch to v12 where pgbench started reporting exit status. Original complaint and patch by Hayato Kuroda. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com
* pgbench: Correct log level of message output when socket wait method fails.Fujii Masao2021-09-29
| | | | | | | | | | | | The failure of socket wait method like "select()" doesn't terminate pgbench. So the log level of error message when that failure happens should be ERROR. But previously FATAL was used in that case. Back-patch to v13 where pgbench started using common logging API. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/20210617005934.8bd37bf72efd5f1b38e6f482@sraoss.co.jp
* Split macros from visibilitymap.h into a separate headerAlexander Korotkov2021-09-23
| | | | | | | | | | | That allows to include just visibilitymapdefs.h from file.c, and in turn, remove include of postgres.h from relcache.h. Reported-by: Andres Freund Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de Author: Alexander Korotkov Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera Backpatch-through: 13
* Release memory allocated by dependency_degreeTomas Vondra2021-09-23
| | | | | | | | | | | | | | | | | Calculating degree of a functional dependency may allocate a lot of memory - we have released mot of the explicitly allocated memory, but e.g. detoasted varlena values were left behind. That may be an issue, because we consider a lot of dependencies (all combinations), and the detoasting may happen for each one again. Fixed by calling dependency_degree() in a dedicated context, and resetting it after each call. We only need the calculated dependency degree, so we don't need to copy anything. Backpatch to PostgreSQL 10, where extended statistics were introduced. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
* Free memory after building each statistics objectTomas Vondra2021-09-23
| | | | | | | | | | | | | | | | | | | | | | | | Until now, all extended statistics on a given relation were built in the same memory context, without resetting. Some of the memory was released explicitly, but not all of it - for example memory allocated while detoasting values is hard to free. This is how it worked since extended statistics were introduced in PostgreSQL 10, but adding support for extended stats on expressions made the issue somewhat worse as it increases the number of statistics to build. Fixed by adding a memory context which gets reset after building each statistics object (all the statistics kinds included in it). Resetting it after building each statistics kind would be even better, but it would require more invasive changes and copying of results, making it harder to backpatch. Backpatch to PostgreSQL 10, where extended statistics were introduced. Author: Justin Pryzby Reported-by: Justin Pryzby Reviewed-by: Tomas Vondra Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
* Invalidate all partitions for a partitioned table in publication.Amit Kapila2021-09-22
| | | | | | | | | | | | | | | | | Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
* Fix places in TestLib.pm in need of adaptation to the output of Msys perlMichael Paquier2021-09-22
| | | | | | | | | | | | | | | | | | | Contrary to the output of native perl, Msys perl generates outputs with CRLFs characters. There are already places in the TAP code where CRLFs (\r\n) are automatically converted to LF (\n) on Msys, but we missed a couple of places when running commands and using their output for comparison, that would lead to failures. This problem has been found thanks to the test added in 5adb067 using TestLib::command_checks_all(), but after a closer look more code paths were missing a filter. This is backpatched all the way down to prevent any surprises if a new test is introduced in stable branches. Reviewed-by: Andrew Dunstan, Álvaro Herrera Discussion: https://postgr.es/m/1252480.1631829409@sss.pgh.pa.us Backpatch-through: 9.6
* Fix misevaluation of STABLE parameters in CALL within plpgsql.Tom Lane2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit 84f5c2908, a STABLE function in a plpgsql CALL statement's argument list would see an up-to-date snapshot, because exec_stmt_call would push a new snapshot. I got rid of that because the possibility of the snapshot disappearing within COMMIT made it too hard to manage a snapshot across the CALL statement. That's fine so far as the procedure itself goes, but I forgot to think about the possibility of STABLE functions within the CALL argument list. As things now stand, those'll be executed with the Portal's snapshot as ActiveSnapshot, keeping them from seeing updates more recent than Portal startup. (VOLATILE functions don't have a problem because they take their own snapshots; which indeed is also why the procedure itself doesn't have a problem. There are no STABLE procedures.) We can fix this by pushing a new snapshot transiently within ExecuteCallStmt itself. Popping the snapshot before we get into the procedure proper eliminates the management problem. The possibly-useless extra snapshot-grab is slightly annoying, but it's no worse than what happened before 84f5c2908. Per bug #17199 from Alexander Nawratil. Back-patch to v11, like the previous patch. Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
* Remove overzealous index deletion assertion.Peter Geoghegan2021-09-20
| | | | | | | | | | | | | | | | | | | | A broken HOT chain is not an unexpected condition, even when the offset number points past the end of the page's line pointer array. heap_prune_chain() does not (and never has) treated this condition as unexpected, so derivative code in heap_index_delete_tuples() shouldn't do so either. Oversight in commit 4228817449. The assertion can probably only fail on Postgres 14 and master. Earlier releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the line pointer array of heap pages. Backpatch all the same, just to be consistent. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org Backpatch: 12-, just like commit 4228817449.
* Don't elide casting to typmod -1.Tom Lane2021-09-20
| | | | | | | | | | | | | | | | | | | | | | Casting a value that's already of a type with a specific typmod to an unspecified typmod doesn't do anything so far as run-time behavior is concerned. However, it really ought to change the exposed type of the expression to match. Up to now, coerce_type_typmod hasn't bothered with that, which creates gotchas in contexts such as recursive unions. If for example one side of the union is numeric(18,3), but it needs to be plain numeric to match the other side, there's no direct way to express that. This is easy enough to fix, by inserting a RelabelType to update the exposed type of the expression. However, it's a bit nervous-making to change this behavior, because it's stood for a really long time. But no complaints have emerged about 14beta3, so go ahead and back-patch. Back-patch of 5c056b0c2 into previous supported branches. Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com Discussion: https://postgr.es/m/1488389.1631984807@sss.pgh.pa.us
* Fix pull_varnos to cope with translated PlaceHolderVars.Tom Lane2021-09-17
| | | | | | | | | | | | | | | | | | Commit 55dc86eca changed pull_varnos to use (if possible) the associated ph_eval_at for a PlaceHolderVar. I missed a fine point though: we might be looking at a PHV in the quals or tlist of a child appendrel, in which case we need to compute a ph_eval_at value that's been translated in the same way that the PHV itself has been (cf. adjust_appendrel_attrs). Fortunately, enough info is available in the PlaceHolderInfo to make such translation possible without additional outside data, so we don't need another round of uglification of planner APIs. This is a little bit complicated, but since it's a hard-to-hit corner case, I'm not much worried about adding cycles here. Per report from Jaime Casanova. Back-patch to v12, like the previous commit. Discussion: https://postgr.es/m/20210915230959.GB17635@ahch-to
* Fix variable shadowing in procarray.c.Fujii Masao2021-09-16
| | | | | | | | | | | | | ProcArrayGroupClearXid function has a parameter named "proc", but the same name was used for its local variables. This commit fixes this variable shadowing, to improve code readability. Back-patch to all supported versions, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
* Disallow LISTEN in background workers.Tom Lane2021-09-15
| | | | | | | | | | | | | | | | | | | It's possible to execute user-defined SQL in some background processes; for example, logical replication workers can fire triggers. This opens the possibility that someone would try to execute LISTEN in such a context. But since only regular backends ever call ProcessNotifyInterrupt, no messages would actually be received, and thus the registered listener would simply prevent the message queue from being cleaned. Eventually NOTIFY would stop working, which is bad. Perhaps someday somebody will invent infrastructure to make listening in a background worker actually useful. In the meantime, forbid it. Back-patch to v13, which is where we introduced the MyBackendType variable. It'd be a lot harder to implement the check without that, and it doesn't seem worth the trouble. Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
* Send NOTIFY signals during CommitTransaction.Tom Lane2021-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages: * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command). * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem. To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15. Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
* jit: Do not try to shut down LLVM state in case of LLVM triggered errors.Andres Freund2021-09-13
| | | | | | | | | | | | | | | | | If an allocation failed within LLVM it is not safe to call back into LLVM as LLVM is not generally safe against exceptions / stack-unwinding. Thus errors while in LLVM code are promoted to FATAL. However llvm_shutdown() did call back into LLVM even in such cases, while llvm_release_context() was careful not to do so. We cannot generally skip shutting down LLVM, as that can break profiling. But it's OK to do so if there was an error from within LLVM. Reported-By: Jelte Fennema <Jelte.Fennema@microsoft.com> Author: Andres Freund <andres@anarazel.de> Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/AM5PR83MB0178C52CCA0A8DEA0207DC14F7FF9@AM5PR83MB0178.EURPRD83.prod.outlook.com Backpatch: 11-, where jit was introduced
* Fix EXIT out of outermost block in plpgsql.Tom Lane2021-09-13
| | | | | | | | | | | | Ordinarily, using EXIT this way would draw "control reached end of function without RETURN". However, if the function is one where we don't require an explicit RETURN (such as a DO block), that should not happen. It did anyway, because add_dummy_return() neglected to account for the case. Per report from Herwig Goemans. Back-patch to all supported branches. Discussion: https://postgr.es/m/868ae948-e3ca-c7ec-95a6-83cfc08ef750@gmail.com
* Fix reorder buffer memory accounting for toast changes.Amit Kapila2021-09-13
| | | | | | | | | | | | | | | | | While processing toast changes in logical decoding, we rejigger the tuple change to point to in-memory toast tuples instead to on-disk toast tuples. And, to make sure the memory accounting is correct, we were subtracting the old change size and then after re-computing the new tuple, re-adding its size at the end. Now, if there is any error before we add the new size, we will release the changes and that will update the accounting info (subtracting the size from the counters). And we were underflowing there which leads to an assertion failure in assert enabled builds and wrong memory accounting in reorder buffer otherwise. Author: Bertrand Drouvot Reviewed-by: Amit Kapila Backpatch-through: 13, where memory accounting was introduced Discussion: https://postgr.es/m/92b0ee65-b8bd-e42d-c082-4f3f4bf12d34@amazon.com
* Fix error handling with threads on OOM in ECPG connection logicMichael Paquier2021-09-13
| | | | | | | | | | | | | | | | | | | | An out-of-memory failure happening when allocating the structures to store the connection parameter keywords and values would mess up with the set of connections saved, as on failure the pthread mutex would still be hold with the new connection object listed but free()'d. Rather than just unlocking the mutex, which would leave the static list of connections into an inconsistent state, move the allocation for the structures of the connection parameters before beginning the test manipulation. This ensures that the list of connections and the connection mutex remain consistent all the time in this code path. This error is unlikely going to happen, but this could mess up badly with ECPG clients in surprising ways, so backpatch all the way down. Reported-by: ryancaicse Discussion: https://postgr.es/m/17186-b4cfd8f0eb4d1dee@postgresql.org Backpatch-through: 9.6
* Make pg_regexec() robust against out-of-range search_start.Tom Lane2021-09-11
| | | | | | | | | | | | | | | | | | If search_start is greater than the length of the string, we should just return REG_NOMATCH immediately. (Note that the equality case should *not* be rejected, since the pattern might be able to match zero characters.) This guards various internal assumptions that the min of a range of string positions is not more than the max. Violation of those assumptions could allow an attempt to fetch string[search_start-1], possibly causing a crash. Jaime Casanova pointed out that this situation is reachable with the new regexp_xxx functions that accept a user-specified start position. I don't believe it's reachable via any in-core call site in v14 and below. However, extensions could possibly call pg_regexec with an out-of-range search_start, so let's back-patch the fix anyway. Discussion: https://postgr.es/m/20210911180357.GA6870@ahch-to
* Fix some anomalies with NO SCROLL cursors.Tom Lane2021-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have long forbidden fetching backwards from a NO SCROLL cursor, but the prohibition didn't extend to cases in which we rewind the query altogether and then re-fetch forwards. I think the reason is that this logic was mainly meant to protect plan nodes that can't be run in the reverse direction. However, re-reading the query output is problematic if the query is volatile (which includes SELECT FOR UPDATE, not just queries with volatile functions): the re-read can produce different results, which confuses the cursor navigation logic completely. Another reason for disliking this approach is that some code paths will either fetch backwards or rewind-and-fetch-forwards depending on the distance to the target row; so that seemingly identical use-cases may or may not draw the "cursor can only scan forward" error. Hence, let's clean things up by disallowing rewind as well as fetch-backwards in a NO SCROLL cursor. Ordinarily we'd only make such a definitional change in HEAD, but there is a third reason to consider this change now. Commit ba2c6d6ce created some new user-visible anomalies for non-scrollable cursors WITH HOLD, in that navigation in the cursor result got confused if the cursor had been partially read before committing. The only good way to resolve those anomalies is to forbid rewinding such a cursor, which allows removal of the incorrect cursor state manipulations that ba2c6d6ce added to PersistHoldablePortal. To minimize the behavioral change in the back branches (including v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore, ie has been held over from a previous transaction due to WITH HOLD. This should avoid breaking most applications that have been sloppy about whether to declare cursors as scrollable. We'll enforce the prohibition across-the-board beginning in v15. Back-patch to v11, as ba2c6d6ce was. Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
* Avoid fetching from an already-terminated plan.Tom Lane2021-09-09
| | | | | | | | | | | | | | | Some plan node types don't react well to being called again after they've already returned NULL. PortalRunSelect() has long dealt with this by calling the executor with NoMovementScanDirection if it sees that we've already run the portal to the end. However, commit ba2c6d6ce overlooked this point, so that persisting an already-fully-fetched cursor would fail if it had such a plan. Per report from Tomas Barton. Back-patch to v11, as the faulty commit was. (I've omitted a test case because the type of plan that causes a problem isn't all that stable.) Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
* Check for relation length overrun soon enough.Tom Lane2021-09-09
| | | | | | | | | | | | | | | | | We don't allow relations to exceed 2^32-1 blocks, because block numbers are 32 bits and the last possible block number is reserved to mean InvalidBlockNumber. There is a check for this in mdextend, but that's really way too late, because the smgr API requires us to create a buffer for the block-to-be-added, and we do not want to have any buffer with blocknum InvalidBlockNumber. (Such a case can trigger assertions in bufmgr.c, plus I think it might confuse ReadBuffer's logic for data-past-EOF later on.) So put the check into ReadBuffer. Per report from Christoph Berg. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/YTn1iTkUYBZfcODk@msg.credativ.de
* Fix issue with WAL archiving in standby.Fujii Masao2021-09-09
| | | | | | | | | | | | | | | | | | | | | | | Previously, walreceiver always closed the currently-opened WAL segment and created its archive notification file, after it finished writing the current segment up and received any WAL data that should be written into the next segment. If walreceiver exited just before any WAL data in the next segment arrived at standby, it did not create the archive notification file of the current segment even though that's known completed. This behavior could cause WAL archiving of the segment to be delayed until subsequent restartpoints or checkpoints created its notification file. To fix the issue, this commit changes walreceiver so that it creates an archive notification file of a current WAL segment immediately if that's known completed before receiving next WAL data. Back-patch to all supported branches. Reported-by: Kyotaro Horiguchi Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200630.165503.1465894182551545886.horikyota.ntt@gmail.com