aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Make autovacuum launcher more responsive to pg_log_backend_memory_contexts().Fujii Masao2021-10-12
| | | | | | | | | | | | | | | | Previously when pg_log_backend_memory_contexts() sent the request to the autovacuum launcher, it could take more than several seconds to log its memory contexts. Because the function (HandleAutoVacLauncherInterrupts) to process any new interrupts that autovacuum launcher received didn't handle the request for logging of memory contexts. This commit changes the function so that it handles the request, to make autovacuum launcher more responsitve to pg_log_backend_memory_contexts(). Back-patch to v14 where pg_log_backend_memory_contexts() was added. Author: Koyu Tanigawa Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi Discussion: https://postgr.es/m/0aae3e074face409b35153451be5cc11@oss.nttdata.com
* Fix EXPLAIN of SEARCH BREADTH FIRST queries some more.Tom Lane2021-10-11
| | | | | | | | | | | | | | | | | | | | Commit 3f50b8263 had an oversight: formerly, to deparse expressions attached to a plan node, it was only necessary to update the deparse_namespace ancestors list alongside calling set_deparse_plan. Now it's necessary to update the ancestors list *first*, because set_deparse_plan consults it, and one call site got that wrong. This error was masked in most cases because explain.c uses just one List object for the ancestors list, updating it in-place as the plan is scanned, so that we accidentally had the right List assigned to dpns->ancestors before it was needed. It would fail only if a WorkTableScan node were the first one that we tried to deparse a subexpression of. Per report from Markus Winand. Like the previous patch, back-patch to v14. Discussion: https://postgr.es/m/648B0505-AA57-42C2-A2DA-E551DE46FA15@winand.at
* Clean up more code using "(expr) ? true : false"Michael Paquier2021-10-11
| | | | | | | | | This is similar to fd0625c, taking care of any remaining code paths that are worth the cleanup. This also changes some cases using opposite expression patterns. Author: Justin Pryzby, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoCdF8dnUvr-BUWWGvA_XhKSoANacBMZb6jKyCk4TYfQ2Q@mail.gmail.com
* Refactor fallback to stderr for csvlog to handle better WIN32 service caseMichael Paquier2021-10-08
| | | | | | | | | | | | | | | | | | | | | | | | send_message_to_server_log() would force a redirection of a log entry to stderr in some cases for csvlog, like the syslogger not being available yet. If this happens, csvlog would fall back to stderr to log some information rather than nothing. The code was organized so as stderr is done before csvlog, with csvlog checking that stderr did not happen yet with a reversed condition. With this code organization, it could be possible to lose some messages if running Postgres as a service on WIN32, as there is no usable stderr, and the handling of the StringInfoData holding the message for stderr was rather confusing because of that. This commit moves the csvlog handling to be before stderr, as as we are able to track down if it is necessary to log something to stderr. The reduces the handling of stderr to be in a single code path, adding a fallback to event logs for a WIN32 service. This also simplifies the way we handle the StringInfoData for stderr, making easier the integration of new file-based log destinations. I got to play with services and event logs on Windows while checking this change. Reviewed-by: Chris Bandy Discussion: https://postgr.es/m/YV0vwBovEKf1WXkl@paquier.xyz
* Add missing word to comment in joinrels.c.Etsuro Fujita2021-10-07
| | | | | | Author: Amit Langote Backpatch-through: 13 Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com
* Fix compilation warning in syslogger.cMichael Paquier2021-10-07
| | | | | | | Oversight in 5c6e33f. Author: Nathan Bossart Discussion: https://postgr.es/m/DD8AD4CE-63B7-44BE-A3D2-14A4E4B19C26@amazon.com
* Improve order in filePeter Eisentraut2021-10-07
| | | | | Move support functions for new PublicationTable node to more sensible locations in the files.
* Refactor per-destination file rotation in logging collectorMichael Paquier2021-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stderr and csvlog have been using duplicated code when it came to the rotation of their file by size, age or if forced by a user request (pg_ctl logrotate or the SQL function pg_rotate_logfile). The main difference between both is that stderr requires its file to always be opened, so as it is possible to have a redirection route if the logging collector is not ready yet to do its work if alternate destinations are enabled. Also, if csvlog gets disabled, we need to close properly its meta-data stored in the logging collector (last file name for current_logfiles and fd currently open for business). Except for those points, the code is the same in terms of error handling and if a file should be created or just continued. This change makes the code simpler overall, and it will help in the introduction of more file-based log destinations. This refactoring is similar to the work done in 5b0b699. Most of the duplication originates from fd801f4. Some of the TAP tests of pg_ctl check the case of a forced log rotation, but this is somewhat limited as there is no coverage for log_rotation_age or log_rotation_size (these may not be worth the extra resources to run either), and no coverage for reload of log_destination with different combinations of stderr and csvlog. I have tested all those cases separately for this refactoring. Author: Michael Paquier Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
* Fix corner-case loss of precision in numeric_power().Dean Rasheed2021-10-06
| | | | | | | | | | | | | | | | | | | | | | | | This fixes a loss of precision that occurs when the first input is very close to 1, so that its logarithm is very small. Formerly, during the initial low-precision calculation to estimate the result weight, the logarithm was computed to a local rscale that was capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be as close as 1e-16383 to 1, hence its logarithm may be as small as 1e-16383, and so the local rscale needs to be allowed to exceed 16383, otherwise all precision is lost, leading to a poor choice of rscale for the full-precision calculation. Fix this by removing the cap on the local rscale during the initial low-precision calculation, as we already do in the full-precision calculation. This doesn't change the fact that the initial calculation is a low-precision approximation, computing the logarithm to around 8 significant digits, which is very fast, especially when the base is very close to 1. Patch by me, reviewed by Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com
* Flexible options for CREATE_REPLICATION_SLOT.Robert Haas2021-10-05
| | | | | | | | | | | | | | | | | | | | Like BASE_BACKUP, CREATE_REPLICATION_SLOT has historically used a hard-coded syntax. To improve future extensibility, adopt a flexible options syntax here, too. In the new syntax, instead of three mutually exclusive options EXPORT_SNAPSHOT, USE_SNAPSHOT, and NOEXPORT_SNAPSHOT, there is now a single SNAPSHOT option with three possible values: 'export', 'use', and 'nothing'. This commit does not remove support for the old syntax. It just adds the new one as an additional option, makes pg_receivewal, pg_recvlogical, and walreceiver processes use it. Patch by me, reviewed by Fabien Coelho, Sergei Kornilov, and Fujii Masao. Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
* Flexible options for BASE_BACKUP.Robert Haas2021-10-05
| | | | | | | | | | | | | | | | | | | | | | | | Previously, BASE_BACKUP used an entirely hard-coded syntax, but that's hard to extend. Instead, adopt the same kind of syntax we've used for SQL commands such as VACUUM, ANALYZE, COPY, and EXPLAIN, where it's not necessary for all of the option names to be parser keywords. In the new syntax, most of the options now take an optional Boolean argument. To match our practice in other in places, the options which the old syntax called NOWAIT and NOVERIFY_CHECKSUMS options are in the new syntax called WAIT and VERIFY_CHECKUMS, and the default value is false. In the new syntax, the FAST option has been replaced by a CHECKSUM option whose value may be 'fast' or 'spread'. This commit does not remove support for the old syntax. It just adds the new one as an additional option, and makes pg_basebackup prefer the new syntax when the server is new enough to support it. Patch by me, reviewed and tested by Fabien Coelho, Sergei Kornilov, Fujii Masao, and Tushar Ahuja. Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
* Make recovery report error message when invalid page header is found.Fujii Masao2021-10-06
| | | | | | | | | | | | | | | | | | | | | | | | Commit 0668719801 changed XLogPageRead() so that it validated the page header, if invalid page header was found reset the error message and retried reading the page, to fix the scenario where streaming standby got stuck at a continuation record. This change hid the error message about invalid page header, which would make it harder for users to investigate what the actual issue was found in WAL. To fix the issue, this commit makes XLogPageRead() report the error message when invalid page header is found. When not in standby mode, an invalid page header should cause recovery to end, not retry reading the page, so XLogPageRead() doesn't need to validate the page header for the retry. Instead, ReadPageInternal() should be responsible for the validation in that case. Therefore this commit changes XLogPageRead() so that if not in standby mode it doesn't validate the page header for the retry. Reported-by: Yugo Nagata Author: Yugo Nagata, Kyotaro Horiguchi Reviewed-by: Ranier Vilela, Fujii Masao Discussion: https://postgr.es/m/20210718045505.32f463ed6c227111038d8ae4@sraoss.co.jp
* Remove obsolete comment in snapbuild.c.Amit Kapila2021-10-05
| | | | | | | | | Commits 955a684e04 and a975ff4980 removed the usage of running xacts information from serialized snapshots but forgot to remove the corresponding comment. Author: Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBifOr7RS=jRe7YCavc646y9omChv6zkWXvJeZcjS9mXA@mail.gmail.com
* Make Unicode makefile parallel-safePeter Eisentraut2021-10-04
| | | | | | | | | Fix the rules so that each rule is parallel safe, using the same trickery that we use elsewhere in the tree for rules that produce more than one output file. Refactor the whole makefile so that there is less repetition. Discussion: https://www.postgresql.org/message-id/18e34084-aab1-1b4c-edd1-c4f9fb04f714%40enterprisedb.com
* Fix duplicate words in commentsDaniel Gustafsson2021-10-04
| | | | | | | Remove accidentally duplicated words in code comments. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87bl45t0co.fsf@wibble.ilmari.org
* Update Unicode map text filesPeter Eisentraut2021-10-04
| | | | | | | A couple of newer ones are available. There are no functional differences, but let's get them in anyway, so that there is no surprise diff next time someone wants to do some actual work in this area.
* Replace occurrences of InvalidXid with InvalidTransactionIdDaniel Gustafsson2021-10-04
| | | | | | | | | While Xid is a known shortening of TransactionId, InvalidXid is not defined in the code. Fix comments which mistakenly were using the shorter version. Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CALj2ACUQzdigML868nV4cojfELPkEzNLNOk7b91Pho4JB90fng@mail.gmail.com
* Fix snapshot builds during promotion of hot standby node with 2PCMichael Paquier2021-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some specific logic is done at the end of recovery when involving 2PC transactions: 1) Call RecoverPreparedTransactions(), to recover the state of 2PC transactions into memory (re-acquire locks, etc.). 2) ShutdownRecoveryTransactionEnvironment(), to move back to normal operations, mainly cleaning up recovery locks and KnownAssignedXids (including any 2PC transaction tracked previously). 3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is the tipping point for any process calling RecoveryInProgress() to check if the cluster is still in recovery or not. Any snapshot taken between steps 2) and 3) would be empty, causing any transaction relying on a snapshot at this point to potentially corrupt data as there could still be some 2PC transactions to track, with RecentXmin moving backwards on successive calls to GetSnapshotData() in the same transaction. As SharedRecoveryState is the point to take into account to know if it is safe to discard KnownAssignedXids, this commit moves step 2) after step 3), so as we can never finish with empty snapshots. This exists since the introduction of hot standby, so backpatch all the way down. The window with incorrect snapshots is extremely small, but I have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm member fairywren. Thomas Munro also found it independently. Special thanks to Andres Freund for taking the time to analyze this issue. Reported-by: Thomas Munro, Michael Paquier Analyzed-by: Andres Freund Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de Backpatch-through: 9.6
* Fix checking of query type in plpgsql's RETURN QUERY command.Tom Lane2021-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to v14, we insisted that the query in RETURN QUERY be of a type that returns tuples. (For instance, INSERT RETURNING was allowed, but not plain INSERT.) That happened indirectly because we opened a cursor for the query, so spi.c checked SPI_is_cursor_plan(). As a consequence, the error message wasn't terribly on-point, but at least it was there. Commit 2f48ede08 lost this detail. Instead, plain RETURN QUERY insisted that the query be a SELECT (by checking for SPI_OK_SELECT) while RETURN QUERY EXECUTE failed to check the query type at all. Neither of these changes was intended. The only convenient place to check this in the EXECUTE case is inside _SPI_execute_plan, because we haven't done parse analysis until then. So we need to pass down a flag saying whether to enforce that the query returns tuples. Fortunately, we can squeeze another boolean into struct SPIExecuteOptions without an ABI break, since there's padding space there. (It's unlikely that any extensions would already be using this new struct, but preserving ABI in v14 seems like a smart idea anyway.) Within spi.c, it seemed like _SPI_execute_plan's parameter list was already ridiculously long, and I didn't want to make it longer. So I thought of passing SPIExecuteOptions down as-is, allowing that parameter list to become much shorter. This makes the patch a bit more invasive than it might otherwise be, but it's all internal to spi.c, so that seems fine. Per report from Marc Bachmann. Back-patch to v14 where the faulty code came in. Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com
* Enable deduplication in system catalog indexes.Peter Geoghegan2021-10-02
| | | | | | | | | | | | | | | The "equality implies image equality" opclass infrastructure disallowed deduplication in system catalog indexes and TOAST indexes before now. That seemed like the right approach back when the infrastructure was added by commit 612a1ab7, since ALTER INDEX cannot set deduplicate_items to 'off' (due to an old implementation restriction). But that decision now seems arbitrary at best. Remove special case handling implementing this policy. No catversion bump, since existing catalog indexes will still work. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wz=rYQHFaJ3WYBdK=xgwxKzaiGMSSrh-ZCREa-pS-7Zjew@mail.gmail.com
* Error out if SKIP LOCKED and WITH TIES are both specifiedAlvaro Herrera2021-10-01
| | | | | | | | | | | | | | | Both bugs #16676[1] and #17141[2] illustrate that the combination of SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes to rows returned to other sessions accessing the same row. Since this situation is detectable from the syntax and hard to fix otherwise, forbid for now, with the potential to fix in the future. [1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org [2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org Backpatch-through: 13, where WITH TIES was introduced Author: David Christensen <david.christensen@crunchydata.com> Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
* Remove unstable, unnecessary test; fix typoAlvaro Herrera2021-10-01
| | | | | | | | | | | | | Commit ff9f111bce24 added some test code that's unportable and doesn't add meaningful coverage. Remove it rather than try and get it to work everywhere. While at it, fix a typo in a log message added by the aforementioned commit. Backpatch to 14. Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
* Avoid believing incomplete MCV-only stats in get_variable_range().Tom Lane2021-10-01
| | | | | | | | | | | | | | | | | | | | | | get_variable_range() would incautiously believe that statistics containing only an MCV list are sufficient to derive a range estimate. That's okay for an enum-like column that contains only MCVs, but otherwise the estimate could be pretty bad. Make it report that the range is indeterminate unless the MCVs plus nullfrac account for the whole table. I don't think this needs a dedicated test case, since a quick code coverage check verifies that the existing regression tests traverse all the alternatives. There is room to doubt that a future-proof test case could be built anyway, given that the submitted example accidentally doesn't fail before v11. Per bug #17207 from Simon Perepelitsa. Back-patch to v10. In principle this has been broken all along, but I'm hesitant to make such changes in 9.6, since if anyone is unhappy with 9.6.24's behavior there will be no second chance to fix it. Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
* Fix Portal snapshot tracking to handle subtransactions properly.Tom Lane2021-10-01
| | | | | | | | | | | | | | | | | | | | | | | Commit 84f5c2908 forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing. To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack. Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it. Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit. Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
* Ensure interleaved_parts field is always initializedDavid Rowley2021-10-01
| | | | | | | | | | | | | | | This field was recently added in db632fbca, however that commit missed one place where it should have initialized the new field to NULL. The missed location is where the PartitionBoundInfo is created for partition-wise join relations. Technically there could be interleaved partitions in a partition-wise join relation, but currently the only optimization we use this field for only does so for base rels and other member rels. So just document that we don't populate this field for join rels. Reported-by: Amit Langote Author: Amit Langote, David Rowley Reviewed-by: Amit Langote, David Rowley Discussion: https://postgr.es/m/CA+HiwqE76Rps24kwHsd2Cr82Ua07tJC9t9reG0c7ScX9n_xrEA@mail.gmail.com
* Treat ETIMEDOUT as indicating a non-recoverable connection failure.Tom Lane2021-09-30
| | | | | | | | | | | | | | | | | | | | | | Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS' list of "errnos that identify hard failure of a previously-established network connection". While one could imagine that this is sometimes recoverable, the same could be said of other entries such as ENETDOWN. In support of this, handle ETIMEDOUT on par with other socket errors in relevant infrastructure, such as TranslateSocketError(). (I made a couple of cosmetic adjustments in TranslateSocketError(), too.) The code now assumes that ETIMEDOUT is defined everywhere, which it should be given that POSIX has required it since SUSv2. Perhaps this should be back-patched, but I'm hesitant to do so given the lack of previous complaints, and the hazard that there's a small ABI break on Windows from redefining the symbol. Even if we decide to do that, it'd be prudent to let this bake awhile in HEAD first. Jelte Fennema Discussion: https://postgr.es/m/AM5PR83MB01782BFF2978505F6D6C559AF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com
* Fix WAL replay in presence of an incomplete recordAlvaro Herrera2021-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit 515e3d84a0b5, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
* Clarify use of "statistics objects" in the codeMichael Paquier2021-09-29
| | | | | | | | | | | | | | | The code inconsistently used "statistic object" or "statistics" where the correct term, as discussed, is actually "statistics object". This improves the state of the code to be more consistent. While on it, fix an incorrect error message introduced in a4d75c8. This error should never happen, as the code states, but it would be misleading. Author: Justin Pryzby Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com Backpatch-through: 14
* Refactor output file handling when forking syslogger under EXEC_BACKENDMichael Paquier2021-09-29
| | | | | | | | | | | | | | | | | | | | | | | | | A forked logging collector in EXEC_BACKEND builds passes down file descriptors (or HANDLEs in WIN32) through a command for files to be reopened (for stderr and csvlog). Some of its logic was duplicated, and this commit refactors the code with some wrapper routines for file reopening after forking and fd grabbing when building the command for the fork. While on it, this simplifies a use of "long" in the code, introduced by ab0ba6e to take care of a warning related to MinGW-W64 when mapping a intptr_t to a printed value. "long" is 32-bit long on Windows, and interoperability of Win32 and Win64 ensures that handles are always 32-bit significant, so we can just use "int" for the same result. This also makes the new routines more symmetric. This change makes easier the introduction of new log destinations in the logging collector, and this is not the only piece of refactoring planned. I have tested this change with EXEC_BACKEND on linux, macos, and of course MSVC (both Win32 and Win64), but not MinGW so the buildfarm may have something to say here. Author: Sehrope Sarkuni, Michael Paquier Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
* Fix typos and grammar in code commentsMichael Paquier2021-09-27
| | | | | | | | | | | Several mistakes have piled in the code comments over the time, including incorrect grammar, function names and simple typos. This commit takes care of a portion of these. No backpatch is done as this is only cosmetic. Author: Justin Pryzby Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
* Remove unneeded nbtree latestRemovedXid comments.Peter Geoghegan2021-09-26
| | | | | | | | | | Discussing the low level issue of nbtree VACUUM and recovery conflicts in btvacuumpage() now seems inappropriate. The same issue is discussed in nbtxlog.h, as well as in a comment block above _bt_delitems_vacuum(). The comment block made more sense when it was part of a broader discussion of nbtree VACUUM "pin scans". These were removed by commit 9f83468b.
* Track LLVM 14 API changes.Thomas Munro2021-09-27
| | | | | | | | | Only done on the master branch for now to fix build farm animal seawasp (which tests bleeeding edge PostgreSQL with bleeding edge LLVM). We can back-patch a consolidated fix closer to LLVM 14's release, once its API has stopped moving around. Discussion: https://postgr.es/m/CA%2BhUKGL%3Dyg6qqgg6W6SAuvRQejditeoDNy-X3b9H_6Fnw8j5Wg%40mail.gmail.com
* Avoid unnecessary division in interval_cmp_value().Tom Lane2021-09-26
| | | | | | | | | Splitting the time field into days and microseconds is pretty useless when we're just going to recombine those values. It's unclear if anyone will notice the speedup in real-world cases, but a cycle shaved is a cycle earned. Discussion: https://postgr.es/m/2629129.1632675713@sss.pgh.pa.us
* Update obsolete nbtree deletion comments.Peter Geoghegan2021-09-25
| | | | | | | | _bt_delitems_delete() is no longer the high-level entry point used by index tuple deletion driven by index tuples whose LP_DEAD bits are set (now called "simple index tuple deletion"). It became a lower level routine that's only called by _bt_delitems_delete_check() following commit d168b66682.
* vacuumlazy.c: Remove obsolete 'onecall' comment.Peter Geoghegan2021-09-25
| | | | | | | | | | | | Remove obsolete reference to lazy_vacuum()'s onecall argument. The function argument was removed by commit 3499df0dee. Also remove adjoining comment block that introduces the wraparound failsafe concept. Talking about the failsafe here no longer makes sense, since lazy_vacuum() (and related functions) are no longer the only place where the failsafe might be triggered. This has been the case since commit c242baa4a8 taught VACUUM to consider triggering the failsafe mechanism during its initial heap scan.
* nbtree README: Add note about latestRemovedXid.Peter Geoghegan2021-09-24
| | | | | | | | | | | Point out that index tuple deletion generally needs a latestRemovedXid value for the deletion operation's WAL record. This is bound to be the most expensive part of the whole deletion operation now that it takes place up front, during original execution. This was arguably an oversight in commit 558a9165e08, which moved the work required to generate these values from index deletion REDO routines to original execution of index deletion operations.
* Release memory allocated by dependency_degreeTomas Vondra2021-09-23
| | | | | | | | | | | | | | | | | Calculating degree of a functional dependency may allocate a lot of memory - we have released mot of the explicitly allocated memory, but e.g. detoasted varlena values were left behind. That may be an issue, because we consider a lot of dependencies (all combinations), and the detoasting may happen for each one again. Fixed by calling dependency_degree() in a dedicated context, and resetting it after each call. We only need the calculated dependency degree, so we don't need to copy anything. Backpatch to PostgreSQL 10, where extended statistics were introduced. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
* Free memory after building each statistics objectTomas Vondra2021-09-23
| | | | | | | | | | | | | | | | | | | | | | | | Until now, all extended statistics on a given relation were built in the same memory context, without resetting. Some of the memory was released explicitly, but not all of it - for example memory allocated while detoasting values is hard to free. This is how it worked since extended statistics were introduced in PostgreSQL 10, but adding support for extended stats on expressions made the issue somewhat worse as it increases the number of statistics to build. Fixed by adding a memory context which gets reset after building each statistics object (all the statistics kinds included in it). Resetting it after building each statistics kind would be even better, but it would require more invasive changes and copying of results, making it harder to backpatch. Backpatch to PostgreSQL 10, where extended statistics were introduced. Author: Justin Pryzby Reported-by: Justin Pryzby Reviewed-by: Tomas Vondra Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
* Document issue with heapam line pointer truncation.Peter Geoghegan2021-09-22
| | | | | | | | | | | | Checking that an offset number isn't past the end of a heap page's line pointer array was just a defensive sanity check for HOT-chain traversal code before commit 3c3b8a4b. It's etrictly necessary now, though. Add comments that reference the issue to code in heapam that needs to get it right. Per suggestion from Alexander Lakhin. Discussion: https://postgr.es/m/f76a292c-9170-1aef-91a0-59d9443b99a3@gmail.com
* Make use of PG_INT64_MAX/PG_INT64_MINPeter Eisentraut2021-09-22
| | | | | This code was written before those symbols were introduced, but now we can simplify it.
* Invalidate all partitions for a partitioned table in publication.Amit Kapila2021-09-22
| | | | | | | | | | | | | | | | | Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
* Add parent table name in an error in reorderbuffer.c.Amit Kapila2021-09-22
| | | | | | | | | This can help in troubleshooting the cause of a particular error that can occur during decoding. Author: Jeremy Schneider Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com
* Fix "single value strategy" index deletion issue.Peter Geoghegan2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | It is not appropriate for deduplication to apply single value strategy when triggered by a bottom-up index deletion pass. This wastes cycles because later bottom-up deletion passes will overinterpret older duplicate tuples that deduplication actually just skipped over "by design". It also makes bottom-up deletion much less effective for low cardinality indexes that happen to cross a meaningless "index has single key value per leaf page" threshold. To fix, slightly narrow the conditions under which deduplication's single value strategy is considered. We already avoided the strategy for a unique index, since our high level goal must just be to buy time for VACUUM to run (not to buy space). We'll now also avoid it when we just had a bottom-up pass that reported failure. The two cases share the same high level goal, and already overlapped significantly, so this approach is quite natural. Oversight in commit d168b666, which added bottom-up index deletion. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznaOvM+Gyj-JQ0X=JxoMDxctDTYjiEuETdAGbF5EUc3MA@mail.gmail.com Backpatch: 14-, where bottom-up deletion was introduced.
* Fix misevaluation of STABLE parameters in CALL within plpgsql.Tom Lane2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit 84f5c2908, a STABLE function in a plpgsql CALL statement's argument list would see an up-to-date snapshot, because exec_stmt_call would push a new snapshot. I got rid of that because the possibility of the snapshot disappearing within COMMIT made it too hard to manage a snapshot across the CALL statement. That's fine so far as the procedure itself goes, but I forgot to think about the possibility of STABLE functions within the CALL argument list. As things now stand, those'll be executed with the Portal's snapshot as ActiveSnapshot, keeping them from seeing updates more recent than Portal startup. (VOLATILE functions don't have a problem because they take their own snapshots; which indeed is also why the procedure itself doesn't have a problem. There are no STABLE procedures.) We can fix this by pushing a new snapshot transiently within ExecuteCallStmt itself. Popping the snapshot before we get into the procedure proper eliminates the management problem. The possibly-useless extra snapshot-grab is slightly annoying, but it's no worse than what happened before 84f5c2908. Per bug #17199 from Alexander Nawratil. Back-patch to v11, like the previous patch. Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
* Document XLOG_INCLUDE_XID a little betterAlvaro Herrera2021-09-21
| | | | | | | | | | | | I noticed that commit 0bead9af484c left this flag undocumented in XLogSetRecordFlags, which led me to discover that the flag doesn't actually do what the one comment on it said it does. Improve the situation by adding some more comments. Backpatch to 14, where the aforementioned commit appears. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/202109212119.c3nhfp64t2ql@alvherre.pgsql
* Introduce GUC shared_memory_size_in_huge_pagesMichael Paquier2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This runtime-computed GUC shows the number of huge pages required for the server's main shared memory area, taking advantage of the work done in 0c39c29 and 0bd305e. This is useful for users to estimate the amount of huge pages required for a server as it becomes possible to do an estimation without having to start the server and potentially allocate a large chunk of shared memory. The number of huge pages is calculated based on the existing GUC huge_page_size if set, or by using the system's default by looking at /proc/meminfo on Linux. There is nothing new here as this commit reuses the existing calculation methods, and just exposes this information directly to the user. The routine calculating the huge page size is refactored to limit the number of files with platform-specific flags. This new GUC's name was the most popular choice based on the discussion done. This is only supported on Linux. I have taken the time to test the change on Linux, Windows and MacOS, though for the last two ones large pages are not supported. The first one calculates correctly the number of pages depending on the existing GUC huge_page_size or the system's default. Thanks to Andres Freund, Robert Haas, Kyotaro Horiguchi, Tom Lane, Justin Pryzby (and anybody forgotten here) for the discussion. Author: Nathan Bossart Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
* Remove overzealous index deletion assertion.Peter Geoghegan2021-09-20
| | | | | | | | | | | | | | | | | | | | A broken HOT chain is not an unexpected condition, even when the offset number points past the end of the page's line pointer array. heap_prune_chain() does not (and never has) treated this condition as unexpected, so derivative code in heap_index_delete_tuples() shouldn't do so either. Oversight in commit 4228817449. The assertion can probably only fail on Postgres 14 and master. Earlier releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the line pointer array of heap pages. Backpatch all the same, just to be consistent. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org Backpatch: 12-, just like commit 4228817449.
* pgstat: Prepare to use mechanism for truncated rels also for droppped rels.Andres Freund2021-09-20
| | | | | | | | | | | | | The upcoming shared memory stats patch drops stats for dropped objects in a transactional manner, rather than removing them later as part of vacuum. This means that stats for DROP inside a transaction needs to handle aborted (sub-)transactions similar to TRUNCATE: The stats up to the DROP should be restored. Rename the existing infrastructure in preparation. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
* pgstat: Split out relation stats handling from AtEO[Sub]Xact_PgStat() etc.Andres Freund2021-09-20
| | | | | | | | | An upcoming patch will add additional work to these functions. To avoid the functions getting too complicated / doing too many things at once, split out sub-tasks into their own functions. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
* Disallow extended statistics on system columnsTomas Vondra2021-09-20
| | | | | | | | | | | | | | | | | | | Since introduction of extended statistics, we've disallowed references to system columns. So for example CREATE STATISTICS s ON ctid FROM t; would fail. But with extended statistics on expressions, it was possible to work around this limitation quite easily CREATE STATISTICS s ON (ctid::text) FROM t; This is an oversight in a4d75c86bf, fixed by adding a simple check. Backpatch to PostgreSQL 14, where support for extended statistics on expressions was introduced. Backpatch-through: 14 Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com