aboutsummaryrefslogtreecommitdiff
path: root/src/backend/tcop
Commit message (Collapse)AuthorAge
...
* Add pg_analyze_and_rewrite_varparams()Peter Eisentraut2022-03-07
| | | | | | | | | | | | | | | | | | | | | This new function extracts common code from PrepareQuery() and exec_parse_message(). It is then exactly analogous to the existing pg_analyze_and_rewrite_fixedparams() and pg_analyze_and_rewrite_withcb(). To unify these two code paths, this makes PrepareQuery() now subject to log_parser_stats. Also, both paths now invoke TRACE_POSTGRESQL_QUERY_REWRITE_START(). PrepareQuery() no longer checks whether a utility statement was specified. The grammar doesn't allow that anyway, and exec_parse_message() supports it, so restricting it doesn't seem necessary. This also adds QueryEnvironment support to the *varparams functions, for consistency with its cousins, even though it is not used right now. Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com
* Parse/analyze function renamingPeter Eisentraut2022-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are three parallel ways to call parse/analyze: with fixed parameters, with variable parameters, and by supplying your own parser callback. Some of the involved functions were confusingly named and made this API structure more confusing. This patch renames some functions to make this clearer: parse_analyze() -> parse_analyze_fixedparams() pg_analyze_and_rewrite() -> pg_analyze_and_rewrite_fixedparams() (Otherwise one might think this variant doesn't accept parameters, but in fact all three ways accept parameters.) pg_analyze_and_rewrite_params() -> pg_analyze_and_rewrite_withcb() (Before, and also when considering pg_analyze_and_rewrite(), one might think this is the only way to pass parameters. Moreover, the parser callback doesn't necessarily need to parse only parameters, it's just one of the things it could do.) parse_fixed_parameters() -> setup_parse_fixed_parameters() parse_variable_parameters() -> setup_parse_variable_parameters() (These functions don't actually do any parsing, they just set up callbacks to use during parsing later.) This patch also adds some const decorations to the fixed-parameters API, so the distinction from the variable-parameters API is more clear. Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com
* Fix SPI's handling of errors during transaction commit.Tom Lane2022-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. We are probably going to have to back-patch this once Python 3.11 ships, but it's a sufficiently basic change that I'm a bit nervous about doing so immediately. Let's let it bake awhile in HEAD first. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
* Suppress warning about stack_base_ptr with late-model GCC.Tom Lane2022-02-17
| | | | | | | | | | | | | | | | | | | | GCC 12 complains that set_stack_base is storing the address of a local variable in a long-lived pointer. This is an entirely reasonable warning (indeed, it just helped us find a bug); but that behavior is intentional here. We can work around it by using __builtin_frame_address(0) instead of a specific local variable; that produces an address a dozen or so bytes different, in my testing, but we don't care about such a small difference. Maybe someday a compiler lacking that function will start to issue a similar warning, but we'll worry about that when it happens. Patch by me, per a suggestion from Andres Freund. Back-patch to v12, which is as far back as the patch will go without some pain. (Recently-established project policy would permit a back-patch as far as 9.2, but I'm disinclined to expend the work until GCC 12 is much more widespread.) Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us
* Move replication slot release to before_shmem_exit().Andres Freund2022-02-14
| | | | | | | | | | | | | | | | | | | | | | Previously, replication slots were released in ProcKill() on error, resulting in reporting replication slot drop of ephemeral slots after the stats subsystem was already shut down. To fix this problem, move replication slot release to a before_shmem_exit() hook that is called before the stats collector shuts down. There wasn't really a good reason for the slot handling to be in ProcKill() anyway. Patch by Masahiko Sawada, with very minor polishing by me. I, Andres, wrote a test for dropping slots during process exit, but there may be some OS dependent issues around the number of times FATAL error messages are displayed due to a still debated libpq issue. So that test will be committed separately / later. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Author: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoDAeEpAbZEyYJsPZJUmSPaRicVSBObaL7sPaofnKz+9zg@mail.gmail.com
* Database-level collation version trackingPeter Eisentraut2022-02-14
| | | | | | | | | | | | | | | | | | This adds to database objects the same version tracking that collation objects have. There is a new pg_database column datcollversion that stores the version, a new function pg_database_collation_actual_version() to get the version from the operating system, and a new subcommand ALTER DATABASE ... REFRESH COLLATION VERSION. This was not originally added together with pg_collation.collversion, since originally version tracking was only supported for ICU, and ICU on a database-level is not currently supported. But we now have version tracking for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so this is useful to have now. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f0ff3190-29a3-5b39-a179-fa32eee57db6%40enterprisedb.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.Tom Lane2021-11-28
| | | | | | | | | | | | | | | | | | | Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128** is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
* Add pg_checkpointer predefined role for CHECKPOINT command.Jeff Davis2021-11-09
| | | | | | | | Any user with the privileges of pg_checkpointer can issue a CHECKPOINT command. Reviewed-by: Stephen Frost Discussion: https://postgr.es/m/67a1d667e8ec228b5e07f232184c80348c5d93f4.camel%40j-davis.com
* Fix Portal snapshot tracking to handle subtransactions properly.Tom Lane2021-10-01
| | | | | | | | | | | | | | | | | | | | | | | Commit 84f5c2908 forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing. To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack. Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it. Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit. Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
* process startup: Split single user code out of PostgresMain().Andres Freund2021-09-17
| | | | | | | | | | | | | | | It was harder than necessary to understand PostgresMain() because the code for a normal backend was interspersed with single-user mode specific code. Split most of the single-user mode code into its own function PostgresSingleUserMain(), that does all the necessary setup for single-user mode, and then hands off after that to PostgresMain(). There still is some single-user mode code in InitPostgres(), and it'd likely be worth moving at least some of it out. But that's for later. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
* process startup: Do InitProcess() at the same time regardless of EXEC_BACKEND.Andres Freund2021-09-16
| | | | | | | | | An upcoming patch splits single user mode into its own function. This makes that easier. Split out for easier review / testing. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
* Fix performance regression from session statistics.Andres Freund2021-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Session statistics, as introduced by 960869da08, had several shortcomings: - an additional GetCurrentTimestamp() call that also impaired the accuracy of the data collected This can be avoided by passing the current timestamp we already have in pgstat_report_stat(). - an additional statistics UDP packet sent every 500ms This is solved by adding the new statistics to PgStat_MsgTabstat. This is conceptually ugly, because session statistics are not table statistics. But the struct already contains data unrelated to tables, so there is not much damage done. Connection and disconnection are reported in separate messages, which reduces the number of additional messages to two messages per session and a slight increase in PgStat_MsgTabstat size (but the same number of table stats fit). - Session time computation could overflow on systems where long is 32 bit. Reported-By: Andres Freund <andres@anarazel.de> Author: Andres Freund <andres@anarazel.de> Author: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de Backpatch: 14-, where the feature was introduced.
* process startup: Initialize PgStartTime earlier in single user mode.Andres Freund2021-09-15
| | | | | | | | | | | | An upcoming patch splits single user mode handling out of PostgresMain(). The startup time only needs to be determined in single user mode. Currently the initialization happens late, which makes the split a bit harder. As postmaster determines the time earlier it makes sense to move the time for single user mode to a roughly similar point in time. Reviewd-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
* Disallow LISTEN in background workers.Tom Lane2021-09-15
| | | | | | | | | | | | | | | | | | | It's possible to execute user-defined SQL in some background processes; for example, logical replication workers can fire triggers. This opens the possibility that someone would try to execute LISTEN in such a context. But since only regular backends ever call ProcessNotifyInterrupt, no messages would actually be received, and thus the registered listener would simply prevent the message queue from being cleaned. Eventually NOTIFY would stop working, which is bad. Perhaps someday somebody will invent infrastructure to make listening in a background worker actually useful. In the meantime, forbid it. Back-patch to v13, which is where we introduced the MyBackendType variable. It'd be a lot harder to implement the check without that, and it doesn't seem worth the trouble. Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
* Send NOTIFY signals during CommitTransaction.Tom Lane2021-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages: * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command). * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem. To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15. Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
* Fix some anomalies with NO SCROLL cursors.Tom Lane2021-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have long forbidden fetching backwards from a NO SCROLL cursor, but the prohibition didn't extend to cases in which we rewind the query altogether and then re-fetch forwards. I think the reason is that this logic was mainly meant to protect plan nodes that can't be run in the reverse direction. However, re-reading the query output is problematic if the query is volatile (which includes SELECT FOR UPDATE, not just queries with volatile functions): the re-read can produce different results, which confuses the cursor navigation logic completely. Another reason for disliking this approach is that some code paths will either fetch backwards or rewind-and-fetch-forwards depending on the distance to the target row; so that seemingly identical use-cases may or may not draw the "cursor can only scan forward" error. Hence, let's clean things up by disallowing rewind as well as fetch-backwards in a NO SCROLL cursor. Ordinarily we'd only make such a definitional change in HEAD, but there is a third reason to consider this change now. Commit ba2c6d6ce created some new user-visible anomalies for non-scrollable cursors WITH HOLD, in that navigation in the cursor result got confused if the cursor had been partially read before committing. The only good way to resolve those anomalies is to forbid rewinding such a cursor, which allows removal of the incorrect cursor state manipulations that ba2c6d6ce added to PersistHoldablePortal. To minimize the behavioral change in the back branches (including v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore, ie has been held over from a previous transaction due to WITH HOLD. This should avoid breaking most applications that have been sloppy about whether to declare cursors as scrollable. We'll enforce the prohibition across-the-board beginning in v15. Back-patch to v11, as ba2c6d6ce was. Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
* process startup: Always call Init[Auxiliary]Process() before BaseInit().Andres Freund2021-08-05
| | | | | | | | | | | | | | | | | | | | | For EXEC_BACKEND InitProcess()/InitAuxiliaryProcess() needs to have been called well before we call BaseInit(), as SubPostmasterMain() needs LWLocks to work. Having the order of initialization differ between platforms makes it unnecessarily hard to understand the system and to add initialization points for new subsystems without a lot of duplication. To be able to change the order, BaseInit() cannot trigger CreateSharedMemoryAndSemaphores() anymore - obviously that needs to have happened before we can call InitProcess(). It seems cleaner to create shared memory explicitly in single user/bootstrap mode anyway. After this change the separation of bufmgr initialization into InitBufferPoolAccess() / InitBufferPoolBackend() is not meaningful anymore so the latter is removed. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
* Use elog, not Assert, to report failure to provide an outer snapshot.Tom Lane2021-07-31
| | | | | | | | | | | | | | | | As of commit 84f5c2908, executing SQL commands (via SPI or otherwise) requires having either an active Portal, or a caller-established active snapshot. We were simply Assert'ing that that's the case. But we've now had a couple different reports of people testing extensions that didn't meet this requirement, and were confused by the resulting crash. Let's convert the Assert to a test-and-elog, in hopes of making the issue clearer for extension authors. Per gripes from Liu Huailing and RekGRpth. Back-patch to v11, like the prior commit. Discussion: https://postgr.es/m/OSZPR01MB6215671E3C5956A034A080DFBEEC9@OSZPR01MB6215.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/17035-14607d308ac8643c@postgresql.org
* Use l*_node() family of functions where appropriatePeter Eisentraut2021-07-19
| | | | | | | Instead of castNode(…, lfoo(…)) Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/87eecahraj.fsf@wibble.ilmari.org
* Improve reporting of "conflicting or redundant options" errors.Dean Rasheed2021-07-15
| | | | | | | | | | | | | | | | | | | | | When reporting "conflicting or redundant options" errors, try to ensure that errposition() is used, to help the user identify the offending option. Formerly, errposition() was invoked in less than 60% of cases. This patch raises that to over 90%, but there remain a few places where the ParseState is not readily available. Using errdetail() might improve the error in such cases, but that is left as a task for the future. Additionally, since this error is thrown from over 100 places in the codebase, introduce a dedicated function to throw it, reducing code duplication. Extracted from a slightly larger patch by Vignesh C. Reviewed by Bharath Rupireddy, Alvaro Herrera, Dilip Kumar, Hou Zhijie, Peter Smith, Daniel Gustafsson, Julien Rouhaud and me. Discussion: https://postgr.es/m/CALDaNm33FFSS5tVyvmkoK2cCMuDVxcui=gFrjti9ROfynqSAGA@mail.gmail.com
* Message style improvementsPeter Eisentraut2021-06-28
|
* Centralize the logic for protective copying of utility statements.Tom Lane2021-06-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the "simple Query" code path, it's fine for parse analysis or execution of a utility statement to scribble on the statement's node tree, since that'll just be thrown away afterwards. However it's not fine if the node tree is in the plan cache, as then it'd be corrupted for subsequent executions. Up to now we've dealt with that by having individual utility-statement functions apply copyObject() if they were going to modify the tree. But that's prone to errors of omission. Bug #17053 from Charles Samborski shows that CREATE/ALTER DOMAIN didn't get this memo, and can crash if executed repeatedly from plan cache. In the back branches, we'll just apply a narrow band-aid for that, but in HEAD it seems prudent to have a more principled fix that will close off the possibility of other similar bugs in future. Hence, let's hoist the responsibility for doing copyObject up into ProcessUtility from its children, thus ensuring that it happens for all utility statement types. Also, modify ProcessUtility's API so that its callers can tell it whether a copy step is necessary. It turns out that in all cases, the immediate caller knows whether the node tree is transient, so this doesn't involve a huge amount of code thrashing. In this way, while we lose a little bit in the execute-from-cache code path due to sometimes copying node trees that wouldn't be mutated anyway, we gain something in the simple-Query code path by not copying throwaway node trees. Statements that are complex enough to be expensive to copy are almost certainly ones that would have to be copied anyway, so the loss in the cache code path shouldn't be much. (Note that this whole problem applies only to utility statements. Optimizable statements don't have the issue because we long ago made the executor treat Plan trees as read-only. Perhaps someday we will make utility statement execution act likewise, but I'm not holding my breath.) Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
* Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.Tom Lane2021-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit 2ecfeda3e into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
* Allow compute_query_id to be set to 'auto' and make it defaultAlvaro Herrera2021-05-15
| | | | | | | | | | | | | | | | | Allowing only on/off meant that all either all existing configuration guides would become obsolete if we disabled it by default, or that we would have to accept a performance loss in the default config if we enabled it by default. By allowing 'auto' as a middle ground, the performance cost is only paid by those who enable pg_stat_statements and similar modules. I only edited the release notes to comment-out a paragraph that is now factually wrong; further edits are probably needed to describe the related change in more detail. Author: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20210513002623.eugftm4nk2lvvks3@nol
* Refactor CHECK_FOR_INTERRUPTS() to add flexibility.Tom Lane2021-05-14
| | | | | | | | | | | | | | | | | | | | Split up CHECK_FOR_INTERRUPTS() to provide an additional macro INTERRUPTS_PENDING_CONDITION(), which just tests whether an interrupt is pending without attempting to service it. This is useful in situations where the caller knows that interrupts are blocked, and would like to find out if it's worth the trouble to unblock them. Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether CHECK_FOR_INTERRUPTS() can be relied on to clear the pending interrupt. This commit doesn't actually add any uses of the new macros, but a follow-on bug fix will do so. Back-patch to all supported branches to provide infrastructure for that fix. Alvaro Herrera and Tom Lane Discussion: https://postgr.es/m/20210513155351.GA7848@alvherre.pgsql
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Revert per-index collation version tracking feature.Thomas Munro2021-05-07
| | | | | | | | | | | | | | | | | | | | | | | Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again. Commits reverted: 1bf946bd: Doc: Document known problem with Windows collation versions. cf002008: Remove no-longer-relevant test case. ef387bed: Fix bogus collation-version-recording logic. 0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>). ff942057: Suppress "warning: variable 'collcollate' set but not used". d50e3b1f: Fix assertion in collation version lookup. f24b1569: Rethink extraction of collation dependencies. 257836a7: Track collation versions for indexes. cd6f479e: Add pg_depend.refobjversion. 7d1297df: Remove pg_collation.collversion. Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
* Disallow calling anything but plain functions via the fastpath API.Tom Lane2021-04-30
| | | | | | | | | | | | | | | | | | | | | | | Reject aggregates, window functions, and procedures. Aggregates failed anyway, though with a somewhat obscure error message. Window functions would hit an Assert or null-pointer dereference. Procedures seemed to work as long as you didn't try to do transaction control, but (a) transaction control is sort of the point of a procedure, and (b) it's not entirely clear that no bugs lurk in that path. Given the lack of testing of this area, it seems safest to be conservative in what we support. Also reject proretset functions, as the fastpath protocol can't support returning a set. Also remove an easily-triggered assertion that the given OID isn't 0; the subsequent lookups can handle that case themselves. Per report from Theodor-Arsenij Larionov-Trichkin. Back-patch to all supported branches. (The procedure angle only applies in v11+, of course.) Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us
* Add heuristic incoming-message-size limits in the server.Tom Lane2021-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had a report of confusing server behavior caused by a client bug that sent junk to the server: the server thought the junk was a very long message length and waited patiently for data that would never come. We can reduce the risk of that by being less trusting about message lengths. For a long time, libpq has had a heuristic rule that it wouldn't believe large message size words, except for a small number of message types that are expected to be (potentially) long. This provides some defense against loss of message-boundary sync and other corrupted-data cases. The server does something similar, except that up to now it only limited the lengths of messages received during the connection authentication phase. Let's do the same as in libpq and put restrictions on the allowed length of all messages, while distinguishing between message types that are expected to be long and those that aren't. I used a limit of 10000 bytes for non-long messages. (libpq's corresponding limit is 30000 bytes, but given the asymmetry of the FE/BE protocol, there's no good reason why the numbers should be the same.) Experimentation suggests that this is at least a factor of 10, maybe a factor of 100, more than we really need; but plenty of daylight seems desirable to avoid false positives. In any case we can adjust the limit based on beta-test results. For long messages, set a limit of MaxAllocSize - 1, which is the most that we can absorb into the StringInfo buffer that the message is collected in. This just serves to make sure that a bogus message size is reported as such, rather than as a confusing gripe about not being able to enlarge a string buffer. While at it, make sure that non-mainline code paths (such as COPY FROM STDIN) are as paranoid as SocketBackend is, and validate the message type code before believing the message length. This provides an additional guard against getting stuck on corrupted input. Discussion: https://postgr.es/m/2003757.1619373089@sss.pgh.pa.us
* adjust query id feature to use pg_stat_activity.query_idBruce Momjian2021-04-20
| | | | | | | | | | | Previously, it was pg_stat_activity.queryid to match the pg_stat_statements queryid column. This is an adjustment to patch 4f0b0966c8. This also adjusts some of the internal function calls to match. Catversion bumped. Reported-by: Álvaro Herrera, Julien Rouhaud Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql
* SQL-standard function bodyPeter Eisentraut2021-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
* Make use of in-core query id added by commit 5fd9dfa5f5Bruce Momjian2021-04-07
| | | | | | | | | | | | | | | | | | | | | | | Use the in-core query id computation for pg_stat_activity, log_line_prefix, and EXPLAIN VERBOSE. Similar to other fields in pg_stat_activity, only the queryid from the top level statements are exposed, and if the backends status isn't active then the queryid from the last executed statements is displayed. Add a %Q placeholder to include the queryid in log_line_prefix, which will also only expose top level statements. For EXPLAIN VERBOSE, if a query identifier has been computed, either by enabling compute_query_id or using a third-party module, display it. Bump catalog version. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
* Move pg_stat_statements query jumbling to core.Bruce Momjian2021-04-07
| | | | | | | | | | | | | | | | | | | Add compute_query_id GUC to control whether a query identifier should be computed by the core (off by default). It's thefore now possible to disable core queryid computation and use pg_stat_statements with a different algorithm to compute the query identifier by using a third-party module. To ensure that a single source of query identifier can be used and is well defined, modules that calculate a query identifier should throw an error if compute_query_id specified to compute a query id and if a query idenfitier was already calculated. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
* Add function to log the memory contexts of specified backend process.Fujii Masao2021-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3e98c0bafb added pg_backend_memory_contexts view to display the memory contexts of the backend process. However its target process is limited to the backend that is accessing to the view. So this is not so convenient when investigating the local memory bloat of other backend process. To improve this situation, this commit adds pg_log_backend_memory_contexts() function that requests to log the memory contexts of the specified backend process. This information can be also collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. So, pg_log_backend_memory_contexts() allows us to see the memory contexts of specified backend more easily. Only superusers are allowed to request to log the memory contexts because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target backend logs its memory contexts at LOG_SERVER_ONLY level, so that these memory contexts will appear in the server log but not be sent to the client. It logs one message per memory context. Because if it buffers all memory contexts into StringInfo to log them as one message, which may require the buffer to be enlarged very much and lead to OOM error since there can be a large number of memory contexts in a backend. When a backend process is consuming huge memory, logging all its memory contexts might overrun available disk space. To prevent this, now this patch limits the number of child contexts to log per parent to 100. As with MemoryContextStats(), it supposes that practical cases where the log gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large. There was another proposed patch to add the function to return the memory contexts of specified backend as the result sets, instead of logging them, in the discussion. However that patch is not included in this commit because it had several issues to address. Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra, Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
* Detect POLLHUP/POLLRDHUP while running queries.Thomas Munro2021-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | Provide a new GUC check_client_connection_interval that can be used to check whether the client connection has gone away, while running very long queries. It is disabled by default. For now this uses a non-standard Linux extension (also adopted by at least one other OS). POLLRDHUP is not defined by POSIX, and other OSes don't have a reliable way to know if a connection was closed without actually trying to read or write. In future we might consider trying to send a no-op/heartbeat message instead, but that could require protocol changes. Author: Sergey Cherkashin <s.cherkashin@postgrespro.ru> Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Tatsuo Ishii <ishii@sraoss.co.jp> Reviewed-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Maksim Milyutin <milyutinma@gmail.com> Reviewed-by: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa.takay@fujitsu.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (much earlier version) Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
* Extended statistics on expressionsTomas Vondra2021-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
* ALTER TABLE ... DETACH PARTITION ... CONCURRENTLYAlvaro Herrera2021-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow a partition be detached from its partitioned table without blocking concurrent queries, by running in two transactions and only requiring ShareUpdateExclusive in the partitioned table. Because it runs in two transactions, it cannot be used in a transaction block. This is the main reason to use dedicated syntax: so that users can choose to use the original mode if they need it. But also, it doesn't work when a default partition exists (because an exclusive lock would still need to be obtained on it, in order to change its partition constraint.) In case the second transaction is cancelled or a crash occurs, there's ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final steps. The main trick to make this work is the addition of column pg_inherits.inhdetachpending, initially false; can only be set true in the first part of this command. Once that is committed, concurrent transactions that use a PartitionDirectory will include or ignore partitions so marked: in optimizer they are ignored if the row is marked committed for the snapshot; in executor they are always included. As a result, and because of the way PartitionDirectory caches partition descriptors, queries that were planned before the detach will see the rows in the detached partition and queries that are planned after the detach, won't. A CHECK constraint is created that duplicates the partition constraint. This is probably not strictly necessary, and some users will prefer to remove it afterwards, but if the partition is re-attached to a partitioned table, the constraint needn't be rechecked. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
* Improve logging of bad parameter values in BIND messages.Tom Lane2021-03-16
| | | | | | | | | | | | | | | Since commit ba79cb5dc, values of bind parameters have been logged during errors in extended query mode. However, we only did that after we'd collected and converted all the parameter values, thus failing to offer any useful localization of invalid-parameter problems. Add a separate callback that's used during parameter collection, and have it print the parameter number, along with the input string if text input format is used. Justin Pryzby and Tom Lane Discussion: https://postgr.es/m/20210104170939.GH9712@telsasoft.com Discussion: https://postgr.es/m/CANfkH5k-6nNt-4cSv1vPB80nq2BZCzhFVR5O4VznYbsX0wZmow@mail.gmail.com
* Remove server and libpq support for old FE/BE protocol version 2.Heikki Linnakangas2021-03-04
| | | | | | | | | | | | | | | | | Protocol version 3 was introduced in PostgreSQL 7.4. There shouldn't be many clients or servers left out there without version 3 support. But as a courtesy, I kept just enough of the old protocol support that we can still send the "unsupported protocol version" error in v2 format, so that old clients can display the message properly. Likewise, libpq still understands v2 ErrorResponse messages when establishing a connection. The impetus to do this now is that I'm working on a patch to COPY FROM, to always prefetch some data. We cannot do that safely with the old protocol, because it requires parsing the input one byte at a time to detect the end-of-copy marker. Reviewed-by: Tom Lane, Alvaro Herrera, John Naylor Discussion: https://www.postgresql.org/message-id/9ec25819-0a8a-d51a-17dc-4150bb3cca3b%40iki.fi
* Use errmsg_internal for debug messagesPeter Eisentraut2021-02-17
| | | | | | An inconsistent set of debug-level messages was not using errmsg_internal(), thus uselessly exposing the messages to translation work. Fix those.
* Allow multiple xacts during table sync in logical replication.Amit Kapila2021-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the initial table data synchronization in logical replication, we use a single transaction to copy the entire table and then synchronize the position in the stream with the main apply worker. There are multiple downsides of this approach: (a) We have to perform the entire copy operation again if there is any error (network breakdown, error in the database operation, etc.) while we synchronize the WAL position between tablesync worker and apply worker; this will be onerous especially for large copies, (b) Using a single transaction in the synchronization-phase (where we can receive WAL from multiple transactions) will have the risk of exceeding the CID limit, (c) The slot will hold the WAL till the entire sync is complete because we never commit till the end. This patch solves all the above downsides by allowing multiple transactions during the tablesync phase. The initial copy is done in a single transaction and after that, we commit each transaction as we receive. To allow recovery after any error or crash, we use a permanent slot and origin to track the progress. The slot and origin will be removed once we finish the synchronization of the table. We also remove slot and origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or ALTER SUBSCRIPTION .. REFERESH and some of the table syncs are still not finished. The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true cannot be executed inside a transaction block because they can now drop the slots for which we have no provision to rollback. This will also open up the path for logical replication of 2PC transactions on the subscriber side. Previously, we can't do that because of the requirement of maintaining a single transaction in tablesync workers. Bump catalog version due to change of state in the catalog (pg_subscription_rel). Author: Peter Smith, Amit Kapila, and Takamichi Osumi Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com
* Avoid crash when rolling back within a prepared statement.Tom Lane2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | If a portal is used to run a prepared CALL or DO statement that contains a ROLLBACK, PortalRunMulti fails because the portal's statement list gets cleared by the rollback. (Since the grammar doesn't allow CALL/DO in PREPARE, the only easy way to get to this is via extended query protocol, which treats all inputs as prepared statements.) It's difficult to avoid resetting the portal early because of resource-management issues, so work around this by teaching PortalRunMulti to be wary of portal->stmts having suddenly become NIL. The crash has only been seen to occur in v13 and HEAD (as a consequence of commit 1cff1b95a having added an extra touch of portal->stmts). But even before that, the code involved touching a List that the portal no longer has any claim on. In the test case at hand, the List will still exist because of another refcount on the cached plan; but I'm far from convinced that it's impossible for the cached plan to have been dropped by the time control gets back to PortalRunMulti. Hence, backpatch to v11 where nested transactions were added. Thomas Munro and Tom Lane, per bug #16811 from James Inform Discussion: https://postgr.es/m/16811-c1b599b2c6c2d622@postgresql.org
* Improve performance of repeated CALLs within plpgsql procedures.Tom Lane2021-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch essentially is cleaning up technical debt left behind by the original implementation of plpgsql procedures, particularly commit d92bc83c4. That patch (or more precisely, follow-on patches fixing its worst bugs) forced us to re-plan CALL and DO statements each time through, if we're in a non-atomic context. That wasn't for any fundamental reason, but just because use of a saved plan requires having a ResourceOwner to hold a reference count for the plan, and we had no suitable resowner at hand, nor would the available APIs support using one if we did. While it's not that expensive to create a "plan" for CALL/DO, the cycles do add up in repeated executions. This patch therefore makes the following API changes: * GetCachedPlan/ReleaseCachedPlan are modified to let the caller specify which resowner to use to pin the plan, rather than forcing use of CurrentResourceOwner. * spi.c gains a "SPI_execute_plan_extended" entry point that lets callers say which resowner to use to pin the plan. This borrows the idea of an options struct from the recently added SPI_prepare_extended, hopefully allowing future options to be added without more API breaks. This supersedes SPI_execute_plan_with_paramlist (which I've marked deprecated) as well as SPI_execute_plan_with_receiver (which is new in v14, so I just took it out altogether). * I also took the opportunity to remove the crude hack of letting plpgsql reach into SPI private data structures to mark SPI plans as "no_snapshot". It's better to treat that as an option of SPI_prepare_extended. Now, when running a non-atomic procedure or DO block that contains any CALL or DO commands, plpgsql creates a ResourceOwner that will be used to pin the plans of the CALL/DO commands. (In an atomic context, we just use CurrentResourceOwner, as before.) Having done this, we can just save CALL/DO plans normally, whether or not they are used across transaction boundaries. This seems to be good for something like 2X speedup of a CALL of a trivial procedure with a few simple argument expressions. By restricting the creation of an extra ResourceOwner like this, there's essentially zero penalty in cases that can't benefit. Pavel Stehule, with some further hacking by me Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
* Refactor option handling of CLUSTER, REINDEX and VACUUMMichael Paquier2021-01-18
| | | | | | | | | | | | | | | | | | | | | | This continues the work done in b5913f6. All the options of those commands are changed to use hex values rather than enums to reduce the risk of compatibility bugs when introducing new options. Each option set is moved into a new structure that can be extended with more non-boolean options (this was already the case of VACUUM). The code of REINDEX is restructured so as manual REINDEX commands go through a single routine from utility.c, like VACUUM, to ease the allocation handling of option parameters when a command needs to go through multiple transactions. This can be used as a base infrastructure for future patches related to those commands, including reindex filtering and tablespace support. Per discussion with people mentioned below, as well as Alvaro Herrera and Peter Eisentraut. Author: Michael Paquier, Justin Pryzby Reviewed-by: Alexey Kondratov, Justin Pryzby Discussion: https://postgr.es/m/X8riynBLwxAD9uKk@paquier.xyz
* Add pg_stat_database counters for sessions and session timeMagnus Hagander2021-01-17
| | | | | | | | | | | | | | | This add counters for number of sessions, the different kind of session termination types, and timers for how much time is spent in active vs idle in a database to pg_stat_database. Internally this also renames the parameter "force" to disconnect. This was the only use-case for the parameter before, so repurposing it to this mroe narrow usecase makes things cleaner than inventing something new. Author: Laurenz Albe Reviewed-By: Magnus Hagander, Soumyadeep Chakraborty, Masahiro Ikeda Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.camel@cybertec.at
* Further second thoughts about idle_session_timeout patch.Tom Lane2021-01-07
| | | | | | | | | | | On reflection, the order of operations in PostgresMain() is wrong. These timeouts ought to be shut down before, not after, we do the post-command-read CHECK_FOR_INTERRUPTS, to guarantee that any timeout error will be detected there rather than at some ill-defined later point (possibly after having wasted a lot of work). This is really an error in the original idle_in_transaction_timeout patch, so back-patch to 9.6 where that was introduced.
* Add idle_session_timeout.Tom Lane2021-01-06
| | | | | | | | | | | | This GUC variable works much like idle_in_transaction_session_timeout, in that it kills sessions that have waited too long for a new client query. But it applies when we're not in a transaction, rather than when we are. Li Japin, reviewed by David Johnston and Hayato Kuroda, some fixes by me Discussion: https://postgr.es/m/763A0689-F189-459E-946F-F0EC4458980B@hotmail.com
* Detect the deadlocks between backends and the startup process.Fujii Masao2021-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The deadlocks that the recovery conflict on lock is involved in can happen between hot-standby backends and the startup process. If a backend takes an access exclusive lock on the table and which finally triggers the deadlock, that deadlock can be detected as expected. On the other hand, previously, if the startup process took an access exclusive lock and which finally triggered the deadlock, that deadlock could not be detected and could remain even after deadlock_timeout passed. This is a bug. The cause of this bug was that the code for handling the recovery conflict on lock didn't take care of deadlock case at all. It assumed that deadlocks involving the startup process and backends were able to be detected by the deadlock detector invoked within backends. But this assumption was incorrect. The startup process also should have invoked the deadlock detector if necessary. To fix this bug, this commit makes the startup process invoke the deadlock detector if deadlock_timeout is reached while handling the recovery conflict on lock. Specifically, in that case, the startup process requests all the backends holding the conflicting locks to check themselves for deadlocks. Back-patch to v9.6. v9.5 has also this bug, but per discussion we decided not to back-patch the fix to v9.5. Because v9.5 doesn't have some infrastructure codes (e.g., 37c54863cf) that this bug fix patch depends on. We can apply those codes for the back-patch, but since the next minor version release is the final one for v9.5, it's risky to do that. If we unexpectedly introduce new bug to v9.5 by the back-patch, there is no chance to fix that. We determined that the back-patch to v9.5 would give more risk than gain. Author: Fujii Masao Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi Discussion: https://postgr.es/m/4041d6b6-cf24-a120-36fa-1294220f8243@oss.nttdata.com
* Re-implement pl/pgsql's expression and assignment parsing.Tom Lane2021-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | Invent new RawParseModes that allow the core grammar to handle pl/pgsql expressions and assignments directly, and thereby get rid of a lot of hackery in pl/pgsql's parser. This moves a good deal of knowledge about pl/pgsql into the core code: notably, we have to invent a CoercionContext that matches pl/pgsql's (rather dubious) historical behavior for assignment coercions. That's getting away from the original idea of pl/pgsql as an arm's-length extension of the core, but really we crossed that bridge a long time ago. The main advantage of doing this is that we can now use the core parser to generate FieldStore and/or SubscriptingRef nodes to handle assignments to pl/pgsql variables that are records or arrays. That fixes a number of cases that had never been implemented in pl/pgsql assignment, such as nested records and array slicing, and it allows pl/pgsql assignment to support the datatype-specific subscripting behaviors introduced in commit c7aba7c14. There are cosmetic benefits too: when a syntax error occurs in a pl/pgsql expression, the error report no longer includes the confusing "SELECT" keyword that used to get prefixed to the expression text. Also, there seem to be some small speed gains. Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us