aboutsummaryrefslogtreecommitdiff
path: root/src/backend/storage/ipc
Commit message (Collapse)AuthorAge
...
* Add debugging help in OwnLatch().Thomas Munro2022-05-31
| | | | | | | | | Build farm animal gharial recently failed a few times in a parallel worker's call to OwnLatch() with "ERROR: latch already owned". Let's turn that into a PANIC and show the PID of the owner, to try to learn more. Discussion: https://postgr.es/m/CA%2BhUKGJ_0RGcr7oUNzcHdn7zHqHSB_wLSd3JyS2YC_DYB%2B-V%3Dg%40mail.gmail.com
* Repurpose PROC_COPYABLE_FLAGS as PROC_XMIN_FLAGSAlvaro Herrera2022-05-19
| | | | | | | | | | | | | | | This is a slight, convenient semantics change from what commit 0f0cfb494004 ("Fix parallel operations that prevent oldest xmin from advancing") introduced that lets us simplify the coding in the one place where it is used. Backpatch to 13. This is related to commit 6fea65508a1a ("Tighten ComputeXidHorizons' handling of walsenders") rewriting the code site where this is used, which has not yet been backpatched, but it may well be in the future. Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/202204191637.eldwa2exvguw@alvherre.pgsql
* Clean up newlines following left parenthesesAlvaro Herrera2022-05-13
| | | | Like commit c9d297751959.
* Add a new shmem_request_hook hook.Robert Haas2022-05-13
| | | | | | | | | | | | | | | | | | | | | | Currently, preloaded libraries are expected to request additional shared memory and LWLocks in _PG_init(). However, it is not unusal for such requests to depend on MaxBackends, which won't be initialized at that time. Such requests could also depend on GUCs that other modules might change. This introduces a new hook where modules can safely use MaxBackends and GUCs to request additional shared memory and LWLocks. Furthermore, this change restricts requests for shared memory and LWLocks to this hook. Previously, libraries could make requests until the size of the main shared memory segment was calculated. Unlike before, we no longer silently ignore requests received at invalid times. Instead, we FATAL if someone tries to request additional shared memory or LWLocks outside of the hook. Nathan Bossart and Julien Rouhaud Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13 Discussion: https://postgr.es/m/Yn2jE/lmDhKtkUdr@paquier.xyz
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* Add logging for excessive ProcSignalBarrier waits.Thomas Munro2022-05-11
| | | | | | | | | | | | | To enable diagnosis of systems that are not processing ProcSignalBarrier requests promptly, add a LOG message every 5 seconds if we seem to be wedged. Although you could already see this state as a wait event in pg_stat_activity, the log message also shows the PID of the process that is preventing progress. Also add DEBUG1 logging around the whole wait loop. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BTgmoYJ03r5359gQutRGP9BtigYCg3_UskcmnVjBf-QO3-0pQ%40mail.gmail.com
* Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().Andres Freund2022-05-02
| | | | | | | | | | | | | | | | | | | | | The tests added in 9f8a050f68d failed nearly reliably on FreeBSD in CI, and occasionally on the buildfarm. That turns out to be caused not by a bug in the test, but by a longstanding bug in recovery conflict handling. The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(), executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad idea, because the deadlock timeout handler (or a spurious latch set) could have interrupted ProcWaitForSignal(). If unlucky that could cause a self-deadlock on ProcArrayLock, if the deadlock check is in SendRecoveryConflictWithBufferPin()->CancelDBBackends(). To fix, set a flag in StandbyTimeoutHandler(), and check the flag in ResolveRecoveryConflictWithBufferPin(). Subsequently the recovery conflict tests will be backpatched. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-
* Fix typo in comment.Etsuro Fujita2022-05-02
|
* Tighten ComputeXidHorizons' handling of walsenders.Tom Lane2022-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ComputeXidHorizons (nee GetOldestXmin) thought that it could identify walsenders by checking for proc->databaseId == 0. Perhaps that was safe when the code was written, but it's been wrong at least since autovacuum was invented. Background processes that aren't connected to any particular database, such as the autovacuum launcher and logical replication launcher, look like that too. This imprecision is harmful because when such a process advertises an xmin, the result is to hold back dead-tuple cleanup in all databases, though it'd be sufficient to hold it back in shared catalogs (which are the only relations such a process can access). Aside from being generally inefficient, this has recently been seen to cause regression test failures in the buildfarm, as a consequence of the logical replication launcher's startup transaction preventing VACUUM from marking pages of a user table as all-visible. We only want that global hold-back effect for the case where a walsender is advertising a hot standby feedback xmin. Therefore, invent a new PGPROC flag that says that a process' xmin should be considered globally, and check that instead of using the incorrect databaseId == 0 test. Currently only a walsender sets that flag, and only if it is not connected to any particular database. (This is for bug-compatibility with the undocumented behavior of the existing code, namely that feedback sent by a client who has connected to a particular database would not be applied globally. I'm not sure this is a great definition; however, such a client is capable of issuing plain SQL commands, and I don't think we want xmins advertised for such commands to be applied globally. Perhaps this could do with refinement later.) While at it, I rewrote the comment in ComputeXidHorizons, and re-ordered the commented-upon if-tests, to make them match up for intelligibility's sake. This is arguably a back-patchable bug fix, but given the lack of complaints I think it prudent to let it age awhile in HEAD first. Discussion: https://postgr.es/m/1346227.1649887693@sss.pgh.pa.us
* Remove extraneous blank lines before block-closing bracesAlvaro Herrera2022-04-13
| | | | | | | | | These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
* Revert the addition of GetMaxBackends() and related stuff.Robert Haas2022-04-12
| | | | | | | | | | | | This reverts commits 0147fc7, 4567596, aa64f23, and 5ecd018. There is no longer agreement that introducing this function was the right way to address the problem. The consensus now seems to favor trying to make a correct value for MaxBackends available to mdules executing their _PG_init() functions. Nathan Bossart Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud
* Fix various typos and spelling mistakes in code commentsDavid Rowley2022-04-11
| | | | | Author: Justin Pryzby Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
* Rename delayChkpt to delayChkptFlags.Robert Haas2022-04-08
| | | | | | | | | | | | | | | | | | Before commit 412ad7a55639516f284cd0ef9757d6ae5c7abd43, delayChkpt was a Boolean. Now it's an integer. Extensions using it need to be appropriately updated, so let's rename the field to make sure that a hard compilation failure occurs. Replacing delayChkpt with delayChkptFlags made a few comments extend past 80 characters, so I reflowed them and changed some wording very slightly. The back-branches will need a different change to restore compatibility with existing minor releases; this is just for master. Per suggestion from Tom Lane. Discussion: http://postgr.es/m/a7880f4d-1d74-582a-ada7-dad168d046d1@enterprisedb.com
* Prefetch data referenced by the WAL, take II.Thomas Munro2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
* pgstat: store statistics in shared memory.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version) Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version) Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version) Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
* dsm: allow use in single user mode.Andres Freund2022-04-06
| | | | | | | | | | | | It might seem pointless to allow use of dsm in single user mode, but otherwise subsystems might need dedicated single user mode code paths. Besides changing the assert, all that's needed is to make some windows code assuming the presence of postmaster conditional. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com
* Explain why the startup process can't cause a shortage of sinval slots.Robert Haas2022-03-29
| | | | | | | Bharath Rupireddy, reviewed by Fujii Masao and Yura Sokolov. Lightly edited by me. Discussion: http://postgr.es/m/CALj2ACU=3_frMkDp9UUeuZoAMjaK1y0Z_q5RFNbGvwi8NM==AA@mail.gmail.com
* Fix typos in standby.cMichael Paquier2022-03-25
| | | | | | | xl_running_xacts exists, not xl_xact_running_xacts. Author: Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB57160D8B01097FFB5C175065941A9@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.Robert Haas2022-03-24
| | | | | | | | | | | | | | | | If TRUNCATE causes some buffers to be invalidated and thus the checkpoint does not flush them, TRUNCATE must also ensure that the corresponding files are truncated on disk. Otherwise, a replay from the checkpoint might find that the buffers exist but have the wrong contents, which may cause replay to fail. Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design suggestion from Heikki Linnakangas, with some changes to the comments by me. Review of this and a prior patch that approached the issue differently by Heikki Linnakangas, Andres Freund, Álvaro Herrera, Masahiko Sawada, and Tom Lane. Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
* Create routine able to set single-call SRFs for Materialize modeMichael Paquier2022-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set-returning functions that use the Materialize mode, creating a tuplestore to include all the tuples returned in a set rather than doing so in multiple calls, use roughly the same set of steps to prepare ReturnSetInfo for this job: - Check if ReturnSetInfo supports returning a tuplestore and if the materialize mode is enabled. - Create a tuplestore for all the tuples part of the returned set in the per-query memory context, stored in ReturnSetInfo->setResult. - Build a tuple descriptor mostly from get_call_result_type(), then stored in ReturnSetInfo->setDesc. Note that there are some cases where the SRF's tuple descriptor has to be the one specified by the function caller. This refactoring is done so as there are (well, should be) no behavior changes in any of the in-core functions refactored, and the centralized function that checks and sets up the function's ReturnSetInfo can be controlled with a set of bits32 options. Two of them prove to be necessary now: - SRF_SINGLE_USE_EXPECTED to use expectedDesc as tuple descriptor, as expected by the function's caller. - SRF_SINGLE_BLESS to validate the tuple descriptor for the SRF. The same initialization pattern is simplified in 28 places per my count as of src/backend/, shaving up to ~900 lines of code. These mostly come from the removal of the per-query initializations and the sanity checks now grouped in a single location. There are more locations that could be simplified in contrib/, that are left for a follow-up cleanup. fcc2817, 07daca5 and d61a361 have prepared the areas of the code related to this change, to ease this refactoring. Author: Melanie Plageman, Michael Paquier Reviewed-by: Álvaro Herrera, Justin Pryzby Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com
* Clean up assorted failures under clang's -fsanitize=undefined checks.Tom Lane2022-03-03
| | | | | | | | | | | | | | | | | | | | | | Most of these are cases where we could call memcpy() or other libc functions with a NULL pointer and a zero count, which is forbidden by POSIX even though every production version of libc allows it. We've fixed such things before in a piecemeal way, but apparently never made an effort to try to get them all. I don't claim that this patch does so either, but it gets every failure I observe in check-world, using clang 12.0.1 on current RHEL8. numeric.c has a different issue that the sanitizer doesn't like: "ln(-1.0)" will compute log10(0) and then try to assign the resulting -Inf to an integer variable. We don't actually use the result in such a case, so there's no live bug. Back-patch to all supported branches, with the idea that we might start running a buildfarm member that tests this case. This includes back-patching c1132aae3 (Check the size in COPY_POINTER_FIELD), which previously silenced some of these issues in copyfuncs.c. Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
* Remove all traces of tuplestore_donestoring() in the C codeMichael Paquier2022-02-17
| | | | | | | | | | | | | | | | | | This routine is a no-op since dd04e95 from 2003, with a macro kept around for compatibility purposes. This has led to the same code patterns being copy-pasted around for no effect, sometimes in confusing ways like in pg_logical_slot_get_changes_guts() from logical.c where the code was actually incorrect. This issue has been discussed on two different threads recently, so rather than living with this legacy, remove any uses of this routine in the C code to simplify things. The compatibility macro is kept to avoid breaking any out-of-core modules that depend on it. Reported-by: Tatsuhito Kasahara, Justin Pryzby Author: Tatsuhito Kasahara Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
* Split xlog.c into xlog.c and xlogrecovery.c.Heikki Linnakangas2022-02-16
| | | | | | | | | | | This moves the functions related to performing WAL recovery into the new xlogrecovery.c source file, leaving xlog.c responsible for maintaining the WAL buffers, coordinating the startup and switch from recovery to normal operations, and other miscellaneous stuff that have always been in xlog.c. Reviewed-by: Andres Freund, Kyotaro Horiguchi, Robert Haas Discussion: https://www.postgresql.org/message-id/a31f27b4-a31d-f976-6217-2b03be646ffa%40iki.fi
* Add WL_SOCKET_CLOSED for socket shutdown events.Thomas Munro2022-02-14
| | | | | | | | | | | | | | | Provide a way for WaitEventSet to report that the remote peer has shut down its socket, independently of whether there is any buffered data remaining to be read. This works only on systems where the kernel exposes that information, namely: * WAIT_USE_POLL builds using POLLRDHUP, if available * WAIT_USE_EPOLL builds using EPOLLRDHUP * WAIT_USE_KQUEUE builds using EV_EOF Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Maksim Milyutin <milyutinma@gmail.com> Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
* Fix DROP {DATABASE,TABLESPACE} on Windows.Thomas Munro2022-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, it was possible for DROP DATABASE, DROP TABLESPACE and ALTER DATABASE SET TABLESPACE to fail because other backends still had file handles open for dropped tables. Windows won't allow a directory containing unlinked-but-still-open files to be unlinked. Tackle this problem by forcing all backends to close all smgr fds. No change for Unix systems, which don't suffer from the problem, but the new code path can be tested by Unix-based developers by defining USE_BARRIER_SMGRRELEASE explicitly. It's possible that PROCSIGNAL_BARRIER_SMGRRELEASE will have more bug-fixing applications soon (under discussion). Note that this is the first user of the ProcSignalBarrier mechanism from commit 16a4e4aec. It could in principle be back-patched as far as 14, but since field complaints are rare and ProcSignalBarrier hasn't been battle-tested, that seems like a bad idea. Fix in master only, where these failures have started to show up in automated testing due to new tests. Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+hUKGLdemy2gBm80kz20GTe6hNVwoErE8KwcJk6-U56oStjtg@mail.gmail.com
* Test honestly for <sys/signalfd.h>.Tom Lane2022-02-09
| | | | | | | | | | | | | | | Commit 6a2a70a02 supposed that any platform having <sys/epoll.h> would also have <sys/signalfd.h>. It turns out there are still a few people using platforms where that's not so, so we'd better make a separate configure probe for it. But since it took this long to notice, I'm content with the decision to not have a separate code path for epoll-only machines; we'll just fall back to using poll() for these stragglers. Per gripe from Gabriela Serventi. Back-patch to v14 where this code came in. Discussion: https://postgr.es/m/CAHOHWE-JjJDfcYuLAAEO7Jk07atFAU47z8TzHzg71gbC0aMy=g@mail.gmail.com
* Remove MaxBackends variable in favor of GetMaxBackends() function.Robert Haas2022-02-08
| | | | | | | | | | | | | | Previously, it was really easy to write code that accessed MaxBackends before we'd actually initialized it, especially when coding up an extension. To make this less error-prune, introduce a new function GetMaxBackends() which should be used to obtain the correct value. This will ERROR if called too early. Demote the global variable to a file-level static, so that nobody can peak at it directly. Nathan Bossart. Idea by Andres Freund. Review by Greg Sabino Mullane, by Michael Paquier (who had doubts about the approach), and by me. Discussion: http://postgr.es/m/20210802224204.bckcikl45uezv5e4@alap3.anarazel.de
* Fix ordering of XIDs in ProcArrayApplyRecoveryInfoTomas Vondra2022-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs before adding them to KnownAssignedXids. But the XIDs are sorted using xidComparator, which compares the XIDs simply as uint32 values, not logically. KnownAssignedXidsAdd() however expects XIDs in logical order, and calls TransactionIdFollowsOrEquals() to enforce that. If there are XIDs for which the two orderings disagree, an error is raised and the recovery fails/restarts. Hitting this issue is fairly easy - you just need two transactions, one started before the 4B limit (e.g. XID 4294967290), the other sometime after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when compared using xidComparator we try to add them in the opposite order. Which makes KnownAssignedXidsAdd() fail with an error like this: ERROR: out-of-order XID insertion in KnownAssignedXids This only happens during replica startup, while processing RUNNING_XACTS records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we skip these records. So this does not affect already running replicas, but if you restart (or create) a replica while there are transactions with XIDs for which the two orderings disagree, you may hit this. Long-running transactions and frequent replica restarts increase the likelihood of hitting this issue. Once the replica gets into this state, it can't be started (even if the old transactions are terminated). Fixed by sorting the XIDs logically - this is fine because we're dealing with normal XIDs (because it's XIDs assigned to backends) and from the same wraparound epoch (otherwise the backends could not be running at the same time on the primary node). So there are no problems with the triangle inequality, which is why xidComparator compares raw values. Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me. This issue is present in all releases since 9.4, however releases up to 9.6 are EOL already so backpatch to 10 only. Reviewed-by: Abhijit Menon-Sen Reviewed-by: Alvaro Herrera Backpatch-through: 10 Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
* Consistently use the function name CreateCheckPoint in code and comments.Amit Kapila2022-01-17
| | | | | Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVZmKsvDjtd45+9oTcnjUJtC4LF2BYK8TpWT1f=NjJX3w@mail.gmail.com
* Improve warning message in pg_signal_backend()John Naylor2022-01-11
| | | | | | | | | | | | Previously, invoking pg_terminate_backend() or pg_cancel_backend() with the postmaster PID produced a "PID XXXX is not a PostgresSQL server process" warning, which does not make sense. Change to "backend process" to make the message more exact. Nathan Bossart, based on an idea from Bharath Rupireddy with input from Tom Lane and Euler Taveira Discussion: https://www.postgresql.org/message-id/flat/CALj2ACW7Rr-R7mBcBQiXWPp=JV5chajjTdudLiF5YcpW-BmHhg@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.Tom Lane2021-11-28
| | | | | | | | | | | | | | | | | | | Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128** is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
* Fix parallel operations that prevent oldest xmin from advancing.Amit Kapila2021-11-19
| | | | | | | | | | | | | | | | | While determining xid horizons, we skip over backends that are running Vacuum. We also ignore Create Index Concurrently, or Reindex Concurrently for the purposes of computing Xmin for Vacuum. But we were not setting the flags corresponding to these operations when they are performed in parallel which was preventing Xid horizon from advancing. The optimization related to skipping Create Index Concurrently, or Reindex Concurrently operations was implemented in PG-14 but the fix is the same for the Parallel Vacuum as well so back-patched till PG-13. Author: Masahiko Sawada Reviewed-by: Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CAD21AoCLQqgM1sXh9BrDFq0uzd3RBFKi=Vfo6cjjKODm0Onr5w@mail.gmail.com
* Make some comments use the term "ProcSignal" for consistencyMichael Paquier2021-11-09
| | | | | | | | The surroundings in procsignal.c prefer using "ProcSignal" rather than "procsignal". Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACX99ghPmm1M_O4r4g+YsXFjCn=qF7PeDXntLwMpht_Gdg@mail.gmail.com
* Reset lastOverflowedXid on standby when neededAlexander Korotkov2021-11-06
| | | | | | | | | | | | | | | | | Currently, lastOverflowedXid is never reset. It's just adjusted on new transactions known to be overflowed. But if there are no overflowed transactions for a long time, snapshots could be mistakenly marked as suboverflowed due to wraparound. This commit fixes this issue by resetting lastOverflowedXid when needed altogether with KnownAssignedXids. Backpatch to all supported versions. Reported-by: Stan Hu Discussion: https://postgr.es/m/CAMBWrQ%3DFp5UAsU_nATY7EMY7NHczG4-DTDU%3DmCvBQZAQ6wa2xQ%40mail.gmail.com Author: Kyotaro Horiguchi, Alexander Korotkov Reviewed-by: Stan Hu, Simon Riggs, Nikolay Samokhvalov, Andrey Borodin, Dmitry Dolgov
* Avoid O(N^2) behavior when the standby process releases many locks.Tom Lane2021-10-31
| | | | | | | | | | | | | | | | | When replaying a transaction that held many exclusive locks on the primary, a standby server's startup process would expend O(N^2) effort on manipulating the list of locks. This code was fine when written, but commit 1cff1b95a made repetitive list_delete_first() calls inefficient, as explained in its commit message. Fix by just iterating the list normally, and releasing storage only when done. (This'd be inadequate if we needed to recover from an error occurring partway through; but we don't.) Back-patch to v13 where 1cff1b95a came in. Nathan Bossart Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
* shm_mq: Update mq_bytes_written less often.Robert Haas2021-10-14
| | | | | | | | | | | | | | Do not update shm_mq's mq_bytes_written until we have written an amount of data greater than 1/4th of the ring size, unless the caller of shm_mq_send(v) requests a flush at the end of the message. This reduces the number of calls to SetLatch(), and also the number of CPU cache misses, considerably, and thus makes shm_mq significantly faster. Dilip Kumar, reviewed by Zhihong Yu and Tomas Vondra. Some minor cosmetic changes by me. Discussion: http://postgr.es/m/CAFiTN-tVXqn_OG7tHNeSkBbN+iiCZTiQ83uakax43y1sQb2OBA@mail.gmail.com
* Fix duplicate words in commentsDaniel Gustafsson2021-10-04
| | | | | | | Remove accidentally duplicated words in code comments. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87bl45t0co.fsf@wibble.ilmari.org
* Introduce GUC shared_memory_size_in_huge_pagesMichael Paquier2021-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This runtime-computed GUC shows the number of huge pages required for the server's main shared memory area, taking advantage of the work done in 0c39c29 and 0bd305e. This is useful for users to estimate the amount of huge pages required for a server as it becomes possible to do an estimation without having to start the server and potentially allocate a large chunk of shared memory. The number of huge pages is calculated based on the existing GUC huge_page_size if set, or by using the system's default by looking at /proc/meminfo on Linux. There is nothing new here as this commit reuses the existing calculation methods, and just exposes this information directly to the user. The routine calculating the huge page size is refactored to limit the number of files with platform-specific flags. This new GUC's name was the most popular choice based on the discussion done. This is only supported on Linux. I have taken the time to test the change on Linux, Windows and MacOS, though for the last two ones large pages are not supported. The first one calculates correctly the number of pages depending on the existing GUC huge_page_size or the system's default. Thanks to Andres Freund, Robert Haas, Kyotaro Horiguchi, Tom Lane, Justin Pryzby (and anybody forgotten here) for the discussion. Author: Nathan Bossart Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
* Fix variable shadowing in procarray.c.Fujii Masao2021-09-16
| | | | | | | | | | | | | ProcArrayGroupClearXid function has a parameter named "proc", but the same name was used for its local variables. This commit fixes this variable shadowing, to improve code readability. Back-patch to all supported versions, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
* Use int instead of size_t in procarray.c.Fujii Masao2021-09-16
| | | | | | | | | | | | | | All size_t variables declared in procarray.c are actually int ones. Let's use int instead of size_t for those variables. Which would reduce Wsign-compare compiler warnings. Back-patch to v14 where commit 941697c3c1 added size_t variables in procarray.c, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
* Fix compilation warning in ipci.cMichael Paquier2021-09-08
| | | | | | A Size had better use %zu when printed. Oversight in bd17880, per buildfarm member lapwing.
* Introduce GUC shared_memory_sizeMichael Paquier2021-09-08
| | | | | | | | | | This runtime-computed GUC shows the size of the server's main shared memory area, taking into account the amount of shared memory allocated by extensions as this is calculated after processing shared_preload_libraries. Author: Nathan Bossart Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
* Move the shared memory size calculation to its own functionMichael Paquier2021-09-06
| | | | | | | | | | This change refactors the shared memory size calculation in CreateSharedMemoryAndSemaphores() to its own function. This is intended for use in a future change related to the setup of huge pages and shared memory with some GUCs, while useful on its own for extensions. Author: Nathan Bossart Discussion: https://postgr.es/m/F2772387-CE0F-46BF-B5F1-CC55516EB885@amazon.com
* Move InRecovery and standbyState global vars to xlogutils.c.Heikki Linnakangas2021-07-31
| | | | | | | | | | They are used in code that runs both during normal operation and during WAL replay, and needs to behave differently during replay. Move them to xlogutils.c, because that's where we have other helper functions used by redo routines. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/b3b71061-4919-e882-4857-27e370ab134a%40iki.fi
* Get rid of artificial restriction on hash table sizes on Windows.Tom Lane2021-07-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The point of introducing the hash_mem_multiplier GUC was to let users reproduce the old behavior of hash aggregation, i.e. that it could use more than work_mem at need. However, the implementation failed to get the job done on Win64, where work_mem is clamped to 2GB to protect various places that calculate memory sizes using "long int". As written, the same clamp was applied to hash_mem. This resulted in severe performance regressions for queries requiring a bit more than 2GB for hash aggregation, as they now spill to disk and there's no way to stop that. Getting rid of the work_mem restriction seems like a good idea, but it's a big job and could not conceivably be back-patched. However, there's only a fairly small number of places that are concerned with the hash_mem value, and it turns out to be possible to remove the restriction there without too much code churn or any ABI breaks. So, let's do that for now to fix the regression, and leave the larger task for another day. This patch does introduce a bit more infrastructure that should help with the larger task, namely pg_bitutils.h support for working with size_t values. Per gripe from Laurent Hasson. Back-patch to v13 where the behavior change came in. Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com
* Deduplicate choice of horizon for a relation procarray.c.Andres Freund2021-07-24
| | | | | | | | | | | | | | | | 5a1e1d83022 was a minimal bug fix for dc7420c2c92. To avoid future bugs of that kind, deduplicate the choice of a relation's horizon into a new helper, GlobalVisHorizonKindForRel(). As the code in question was only introduced in dc7420c2c92 it seems worth backpatching this change as well, otherwise 14 will look different from all other branches. A different approach to this was suggested by Matthias van de Meent. Author: Andres Freund Discussion: https://postgr.es/m/20210621122919.2qhu3pfugxxp3cji@alap3.anarazel.de Backpatch: 14, like 5a1e1d83022
* Fix lack of message pluralizationPeter Eisentraut2021-07-14
|
* Remove some dead stores.Thomas Munro2021-07-02
| | | | | | | | | Remove redundant local variable assignments left behind by commit 2fc7af5e966. Author: Quan Zongliang <quanzongliang@yeah.net> Reviewed-by: Jacob Champion <pchampion@vmware.com> Discussion: https://postgr.es/m/de141d14-4fd6-3148-99bf-856b71aa948a%40yeah.net
* Improve various places that double the size of a bufferDavid Rowley2021-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several places were performing a tight loop to determine the first power of 2 number that's > or >= the required memory. Instead of using a loop for that, we can use pg_nextpower2_32 or pg_nextpower2_64. When we need a power of 2 number equal to or greater than a given amount, we just pass the amount to the nextpower2 function. When we need a power of 2 greater than the amount, we just pass the amount + 1. Additionally, in tsearch there were a couple of locations that were performing a while loop when a simple "if" would have done. In both of these locations only 1 item is being added, so the loop could only have ever iterated once. Changing the loop into an if statement makes the code very slightly more optimal as the condition is checked once rather than twice. There are quite a few remaining locations that increase the size of the buffer in the following form: while (reqsize >= buflen) { buflen *= 2; buf = repalloc(buf, buflen); } These are not touched in this commit. repalloc will error out for sizes larger than MaxAllocSize. Changing these to use pg_nextpower2_32 would remove the chance of that error being raised. It's unclear from the code if the sizes could ever become that large, so err on the side of caution. Discussion: https://postgr.es/m/CAApHDvp=tns7RL4PH0ZR0M+M-YFLquK7218x=0B_zO+DbOma+w@mail.gmail.com Reviewed-by: Zhihong Yu