aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Be smarter about freeing tuples during tuplesortsDavid Rowley2022-09-01
| | | | | | | | | | | | | | | | | | | | | | During dumptuples() the call to writetuple() would pfree any non-null tuple. This was quite wasteful as this happens just before we perform a reset of the context which stores all of those tuples. It seems to make sense to do a bit of a code refactor to make this work, so here we just get rid of the writetuple function and adjust the WRITETUP macro to call the state's writetup function. The WRITETUP usage in mergeonerun() always has state->slabAllocatorUsed == true, so writetuple() would never free the tuple or do any memory accounting. The only call path that needs memory accounting done is in dumptuples(), so let's just do it manually there. In passing, let's get rid of the state->memtupcount-- code that counts the memtupcount down to 0 one tuple at a time inside the loop. That seems to be a rather inefficient way to set memtupcount to 0, so let's just zero it after the loop instead. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvqZXoDCyrfCzZJR0-xH+7_q+GgitcQiYXUjRani7h4j8Q@mail.gmail.com
* Prevent long-term memory leakage in autovacuum launcher.Tom Lane2022-08-31
| | | | | | | | | | | | | | | | | | | | | get_database_list() failed to restore the caller's memory context, instead leaving current context set to TopMemoryContext which is how CommitTransactionCommand() leaves it. The callers both think they are using short-lived contexts, for the express purpose of not having to worry about cleaning up individual allocations. The net effect therefore is that supposedly short-lived allocations could accumulate indefinitely in the launcher's TopMemoryContext. Although this has been broken for a long time, it seems we didn't have any obvious memory leak here until v15's rearrangement of the stats logic. I (tgl) am not entirely convinced that there's no other leak at all, though, and we're surely at risk of adding one in future back-patched fixes. So back-patch to all supported branches, even though this may be only a latent bug in pre-v15. Reid Thompson Discussion: https://postgr.es/m/972a4e12b68b0f96db514777a150ceef7dcd2e0f.camel@crunchydata.com
* Derive freeze cutoff from nextXID, not OldestXmin.Peter Geoghegan2022-08-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before now, the cutoffs that VACUUM used to determine which XIDs/MXIDs to freeze were determined at the start of each VACUUM by taking related cutoffs that represent which XIDs/MXIDs VACUUM should treat as still running, and subtracting an XID/MXID age based value controlled by GUCs like vacuum_freeze_min_age. The FreezeLimit cutoff (XID freeze cutoff) was derived by subtracting an XID age value from OldestXmin, while the MultiXactCutoff cutoff (MXID freeze cutoff) was derived by subtracting an MXID age value from OldestMxact. This approach didn't match the approach used nearby to determine whether this VACUUM operation should be an aggressive VACUUM or not. VACUUM now uses the standard approach instead: it subtracts the same age-based values from next XID/next MXID (rather than subtracting from OldestXmin/OldestMxact). This approach is simpler and more uniform. Most of the time it will have only a negligible impact on how and when VACUUM freezes. It will occasionally make VACUUM more robust in the event of problems caused by long running transaction. These are cases where OldestXmin and OldestMxact are held back by so much that they attain an age that is a significant fraction of the value of age-based settings like vacuum_freeze_min_age. There is no principled reason why freezing should be affected in any way by the presence of a long-running transaction -- at least not before the point that the OldestXmin and OldestMxact limits used by each VACUUM operation attain an age that makes it unsafe to freeze some of the XIDs/MXIDs whose age exceeds the value of the relevant age-based settings. The new approach should at least make freezing degrade more gracefully than before, even in the most extreme cases. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Nathan Bossart <nathandbossart@gmail.com> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAH2-WzkOv5CEeyOO=c91XnT5WBR_0gii0Wn5UbZhJ=4TTykDYg@mail.gmail.com
* In the Snowball dictionary, don't try to stem excessively-long words.Tom Lane2022-08-31
| | | | | | | | | | | | | | | | | | | | | If the input word exceeds 1000 bytes, don't pass it to the stemmer; just return it as-is after case folding. Such an input is surely not a word in any human language, so whatever the stemmer might do to it would be pretty dubious in the first place. Adding this restriction protects us against a known recursion-to-stack-overflow problem in the Turkish stemmer, and it seems like good insurance against any other safety or performance issues that may exist in the Snowball stemmers. (I note, for example, that they contain no CHECK_FOR_INTERRUPTS calls, so we really don't want them running for a long time.) The threshold of 1000 bytes is arbitrary. An alternative definition could have been to treat such words as stopwords, but that seems like a bigger break from the old behavior. Per report from Egor Chindyaskin and Alexander Lakhin. Thanks to Olly Betts for the recommendation to fix it this way. Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
* Fix a bug in roles_is_member_of.Robert Haas2022-08-31
| | | | | | | | | | | | | | Commit e3ce2de09d814f8770b2e3b3c152b7671bcdb83f rearranged this function to be able to identify which inherited role had admin option on the target role, but it got the order of operations wrong, causing the function to return wrong answers in the presence of non-inherited grants. Fix that, and add a test case that verifies the correct behavior. Patch by me, reviewed by Nathan Bossart Discussion: http://postgr.es/m/CA+TgmoYamnu-xt-u7CqjYWnRiJ6BQaSpYOHXP=r4QGTfd1N_EA@mail.gmail.com
* Drop replication origin slots before tablesync worker exits.Amit Kapila2022-08-30
| | | | | | | | | | | | | | | | | | | Currently, the replication origin tracking of the tablesync worker is dropped by the apply worker. So, there will be a small lag between the tablesync worker exit and its origin tracking got removed. In the meantime, new tablesync workers can be launched and will try to set up a new origin tracking. This can lead the system to reach max configured limit (max_replication_slots) even if the user has configured the max limit considering the number of tablesync workers required in the system. We decided not to back-patch as this can occur in very narrow circumstances and users have to option to increase the configured limit by increasing max_replication_slots. Reported-by: Hubert Depesz Lubaczewski Author: Ajin Cherian Reviwed-by: Masahiko Sawada, Peter Smith, Hou Zhijie, Amit Kapila Discussion: https://postgr.es/m/20220714115155.GA5439@depesz.com
* Adjust comments that called MultiXactIds "XMIDs".Peter Geoghegan2022-08-29
| | | | Oversights in commits 0b018fab and f3c15cbe.
* Use MAXALIGN() in calculations using sizeof(SlabBlock)David Rowley2022-08-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | c6e0fe1f2 added a new pointer field to SlabBlock to make it 4 bytes larger on 32-bit machines. Prior to that commit, the size of that struct was a multiple of 8, which meant that MAXALIGN(sizeof(SlabBlock)) was the same as sizeof(SlabBlock), however, after c6e0fe1f2, due to the addition of the new pointer field to store a pointer to the owning context, that was no longer true on builds with sizeof(void *) == 4. This problem was highlighted by an Assert failure which was checking that the pointer given to pfree() was MAXALIGNED. Various 32-bit ARM buildfarm animals were failing. These have MAXIMUM_ALIGNOF of 8. The only 32-bit testing I'd managed to do on c6e0fe1f2 had been on x86, which has a MAXIMUM_ALIGNOF of 4, therefore did not exhibit this issue. Here we define Slab_BLOCKHDRSZ and copy what is being done in aset.c and generation.c for doing calculations based on the size of the context's block type. This means that SlabAlloc() will now always return a MAXALIGNed pointer. This also fixes an incorrect sentinel_ok() check in SlabCheck() which was incorrectly checking the wrong sentinel byte. This must have previously not caused any issues due to the fullChunkSize never being large enough to store the sentinel byte. Diagnosed-by: Tomas Vondra, Tom Lane Author: Tomas Vondra, David Rowley Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
* Cleanup more code and comments related to Windows NT4 (XP days)Michael Paquier2022-08-30
| | | | | | | | | | | All the code and comments cleaned up here is irrelevant since 495ed0e. Note that this removes an assumption that CreateRestrictedToken() may not exist, something that could have happened when running under Windows NT as the code stated. Rather than assuming that it may not exist, this causes pg_ctl to fail hard if the function cannot be loaded. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20220826112637.GD2342@telsasoft.com
* Clean up inconsistent use of fflush().Tom Lane2022-08-29
| | | | | | | | | | | | | | | | | | | | | | More than twenty years ago (79fcde48b), we hacked the postmaster to avoid a core-dump on systems that didn't support fflush(NULL). We've mostly, though not completely, hewed to that rule ever since. But such systems are surely gone in the wild, so in the spirit of cleaning out no-longer-needed portability hacks let's get rid of multiple per-file fflush() calls in favor of using fflush(NULL). Also, we were fairly inconsistent about whether to fflush() before popen() and system() calls. While we've received no bug reports about that, it seems likely that at least some of these call sites are at risk of odd behavior, such as error messages appearing in an unexpected order. Rather than expend a lot of brain cells figuring out which places are at hazard, let's just establish a uniform coding rule that we should fflush(NULL) before these calls. A no-op fflush() is surely of trivial cost compared to launching a sub-process via a shell; while if it's not a no-op then we likely need it. Discussion: https://postgr.es/m/2923412.1661722825@sss.pgh.pa.us
* Prevent WAL corruption after a standby promotion.Robert Haas2022-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | When a PostgreSQL instance performing archive recovery but not using standby mode is promoted, and the last WAL segment that it attempted to read ended in a partial record, the previous code would create invalid WAL on the new timeline. The WAL from the previously timeline would be copied to the new timeline up until the end of the last valid record, but instead of beginning to write WAL at immediately afterwards, the promoted server would write an overwrite contrecord at the beginning of the next segment. The end of the previous segment would be left as all-zeroes, resulting in failures if anything tried to read WAL from that file. The root of the issue is that ReadRecord() decides whether to set abortedRecPtr and missingContrecPtr based on the value of StandbyMode, but ReadRecord() switches to a new timeline based on the value of ArchiveRecoveryRequested. We shouldn't try to write an overwrite contrecord if we're switching to a new timeline, so change the test in ReadRecod() to check ArchiveRecoveryRequested instead. Code fix by Dilip Kumar. Comments by me incorporating suggested language from Álvaro Herrera. Further review from Kyotaro Horiguchi and Sami Imseih. Discussion: http://postgr.es/m/CAFiTN-t7umki=PK8dT1tcPV=mOUe2vNhHML6b3T7W7qqvvajjg@mail.gmail.com Discussion: http://postgr.es/m/FB0DEA0B-E14E-43A0-811F-C1AE93D00FF3%40amazon.com
* Improve performance of and reduce overheads of memory managementDavid Rowley2022-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we palloc a chunk of memory, traditionally, we prefix the returned pointer with a pointer to the memory context to which the chunk belongs. This is required so that we're able to easily determine the owning context when performing operations such as pfree() and repalloc(). For the AllocSet context, prior to this commit we additionally prefixed the pointer to the owning context with the size of the chunk. This made the header 16 bytes in size. This 16-byte overhead was required for all AllocSet allocations regardless of the allocation size. For the generation context, the problem was worse; in addition to the pointer to the owning context and chunk size, we also stored a pointer to the owning block so that we could track the number of freed chunks on a block. The slab allocator had a 16-byte chunk header. The changes being made here reduce the chunk header size down to just 8 bytes for all 3 of our memory context types. For small to medium sized allocations, this significantly increases the number of chunks that we can fit on a given block which results in much more efficient use of memory. Additionally, this commit completely changes the rule that pointers to palloc'd memory must be directly prefixed by a pointer to the owning memory context and instead, we now insist that they're directly prefixed by an 8-byte value where the least significant 3-bits are set to a value to indicate which type of memory context the pointer belongs to. Using those 3 bits as an index (known as MemoryContextMethodID) to a new array which stores the methods for each memory context type, we're now able to pass the pointer given to functions such as pfree() and repalloc() to the function specific to that context implementation to allow them to devise their own methods of finding the memory context which owns the given allocated chunk of memory. The reason we're able to reduce the chunk header down to just 8 bytes is because of the way we make use of the remaining 61 bits of the required 8-byte chunk header. Here we also implement a general-purpose MemoryChunk struct which makes use of those 61 remaining bits to allow the storage of a 30-bit value which the MemoryContext is free to use as it pleases, and also the number of bytes which must be subtracted from the chunk to get a reference to the block that the chunk is stored on (also 30 bits). The 1 additional remaining bit is to denote if the chunk is an "external" chunk or not. External here means that the chunk header does not store the 30-bit value or the block offset. The MemoryContext can use these external chunks at any time, but must use them if any of the two 30-bit fields are not large enough for the value(s) that need to be stored in them. When the chunk is marked as external, it is up to the MemoryContext to devise its own means to determine the block offset. Using 3-bits for the MemoryContextMethodID does mean we're limiting ourselves to only having a maximum of 8 different memory context types. We could reduce the bit space for the 30-bit value a little to make way for more than 3 bits, but it seems like it might be better to do that only if we ever need more than 8 context types. This would only be a problem if some future memory context type which does not use MemoryChunk really couldn't give up any of the 61 remaining bits in the chunk header. With this MemoryChunk, each of our 3 memory context types can quickly obtain a reference to the block any given chunk is located on. AllocSet is able to find the context to which the chunk is owned, by first obtaining a reference to the block by subtracting the block offset as is stored in the 'hdrmask' field and then referencing the block's 'aset' field. The Generation context uses the same method, but GenerationBlock did not have a field pointing back to the owning context, so one is added by this commit. In aset.c and generation.c, all allocations larger than allocChunkLimit are stored on dedicated blocks. When there's just a single chunk on a block like this, it's easy to find the block from the chunk, we just subtract the size of the block header from the chunk pointer. The size of these chunks is also known as we store the endptr on the block, so we can just subtract the pointer to the allocated memory from that. Because we can easily find the owning block and the size of the chunk for these dedicated blocks, we just always use external chunks for allocation sizes larger than allocChunkLimit. For generation.c, this sidesteps the problem of non-external MemoryChunks being unable to represent chunk sizes >= 1GB. This is less of a problem for aset.c as we store the free list index in the MemoryChunk's spare 30-bit field (the value of which will never be close to using all 30-bits). We can easily reverse engineer the chunk size from this when needed. Storing this saves AllocSetFree() from having to make a call to AllocSetFreeIndex() to determine which free list to put the newly freed chunk on. For the slab allocator, this commit adds a new restriction that slab chunks cannot be >= 1GB in size. If there happened to be any users of slab.c which used chunk sizes this large, they really should be using AllocSet instead. Here we also add a restriction that normal non-dedicated blocks cannot be 1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB during the creation of an AllocSet or Generation context. Allocations can still be larger than 1GB, it's just these will always be on dedicated blocks (which do not have the 1GB restriction). Author: Andres Freund, David Rowley Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
* Fix the incorrect assertion introduced in commit 7f13ac8123.Amit Kapila2022-08-29
| | | | | | | | | | | | | It has been incorrectly assumed in commit 7f13ac8123 that we can either purge all or none in the catalog modifying xids list retrieved from a serialized snapshot. It is quite possible that some of the xids in that array are old enough to be pruned but not others. As per buildfarm Author: Amit Kapila and Masahiko Sawada Reviwed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAA4eK1LBtv6ayE+TvCcPmC-xse=DVg=SmbyQD1nv_AaqcpUJEg@mail.gmail.com
* Add more detail why repalloc and pfree do not accept NULL pointersPeter Eisentraut2022-08-28
| | | | | | | Per discussion, we choose not to change this. This just gives a little bit more information. Discussion: https://www.postgresql.org/message-id/flat/cf26e970-8e92-59f1-247a-aa265235075b%40enterprisedb.com
* Doc: add comment about bug fixed in back branches as of 3f7323cbb.Tom Lane2022-08-27
| | | | | | | | | While the bug I just fixed in the back branches doesn't exist in HEAD, the requirement that MULTIEXPR SubPlans not share output parameters still does. Add a comment to memorialize that, because perhaps it could be an issue again someday. Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
* Fix typo in comment for writetuple() functionAlexander Korotkov2022-08-27
| | | | | Reported-by: David Rowley Discussion: https://postgr.es/m/CAApHDvrZ9Ky2LcWwcKsbdYChA850JE5qS%3DkGJiTNWS8mbBXZHw%40mail.gmail.com
* Remove unneeded null pointer checks before PQfreemem()Peter Eisentraut2022-08-26
| | | | | | | | PQfreemem() just calls free(), and the latter already checks for null pointers. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/cf26e970-8e92-59f1-247a-aa265235075b%40enterprisedb.com
* Remove unnecessary casts in free() and pfree()Peter Eisentraut2022-08-26
| | | | | Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/cf26e970-8e92-59f1-247a-aa265235075b%40enterprisedb.com
* Remove obsolete commentPeter Eisentraut2022-08-26
| | | | | | | | The comment in basebackup.c updated by 33bd4698c11 was actually obsolete to begin with, since the symbols it was referring to haven't existed in that header file for quite some time. The header file is still needed for other reasons, though, so keep the #include, just drop the comment.
* Fix typo in comment.Etsuro Fujita2022-08-26
|
* Remove configure probe for sockaddr_in6 and require AF_INET6.Thomas Munro2022-08-26
| | | | | | | | | | | | | | | | | SUSv3 <netinet/in.h> defines struct sockaddr_in6, and all targeted Unix systems have it. Windows has it in <ws2ipdef.h>. Remove the configure probe, the macro and a small amount of dead code. Also remove a mention of IPv6-less builds from the documentation, since there aren't any. This is similar to commits f5580882 and 077bf2f2 for Unix sockets. Even though AF_INET6 is an "optional" component of SUSv3, there are no known modern operating system without it, and it seems even less likely to be omitted from future systems than AF_UNIX. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA+hUKGKErNfhmvb_H0UprEmp4LPzGN06yR2_0tYikjzB-2ECMw@mail.gmail.com
* Small refactor to get rid of -Wshadow=compatible-local warningDavid Rowley2022-08-26
| | | | | | | | | | Further reduce -Wshadow=compatible-local warnings by 1 by refactoring the code in gistRelocateBuildBuffersOnSplit() to make use of foreach_current_index() instead of manually incrementing a variable on each loop. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvpGZX-X=Bn4moyXgfFa0CdSUwoa04d3isit3=1qo8F8Bw@mail.gmail.com
* More -Wshadow=compatible-local warning fixesDavid Rowley2022-08-26
| | | | | | | | | | | | In a similar effort to f01592f91, here we're targetting fixing the warnings where we've deemed the shadowing variable to serve a close enough purpose to the shadowed variable just to reuse the shadowed version and not declare the shadowing variable at all. By my count, this takes the warning count from 106 down to 71. Author: Justin Pryzby Discussion: https://postgr.es/m/20220825020839.GT2342@telsasoft.com
* Allow grant-level control of role inheritance behavior.Robert Haas2022-08-25
| | | | | | | | | | | | | | | | | | | | The GRANT statement can now specify WITH INHERIT TRUE or WITH INHERIT FALSE to control whether the member inherits the granted role's permissions. For symmetry, you can now likewise write WITH ADMIN TRUE or WITH ADMIN FALSE to turn ADMIN OPTION on or off. If a GRANT does not specify WITH INHERIT, the behavior based on whether the member role is marked INHERIT or NOINHERIT. This means that if all roles are marked INHERIT or NOINHERIT before any role grants are performed, the behavior is identical to what we had before; otherwise, it's different, because ALTER ROLE [NO]INHERIT now only changes the default behavior of future grants, and has no effect on existing ones. Patch by me. Reviewed and testing by Nathan Bossart and Tushar Ahuja, with design-level comments from various others. Discussion: http://postgr.es/m/CA+Tgmoa5Sf4PiWrfxA=sGzDKg0Ojo3dADw=wAHOhR9dggV=RmQ@mail.gmail.com
* Remove SUBSYS.o rule in common.mk, hasn't been used in a long timeAndres Freund2022-08-24
| | | | | | | | Apparently I missed that this SUBSYS.o rule isn't needed anymore in a4ebbd27527, likely because there still is a reference to it due to AIX - but that's self contained in src/backend/Makefile Discussion: https://postgr.es/m/20220820174213.d574qde4ptwdzoqz@awork3.anarazel.de
* Remove rule to generate postgres.o, not needed for 20+ yearsAndres Freund2022-08-24
| | | | Discussion: https://postgr.es/m/20220820174213.d574qde4ptwdzoqz@awork3.anarazel.de
* Include RelFileLocator fields individually in BufferTag.Robert Haas2022-08-24
| | | | | | | | | | | | | | This is preparatory work for a project to increase the number of bits in a RelFileNumber from 32 to 56. Along the way, introduce static inline accessor functions for a couple of BufferTag fields. Dilip Kumar, reviewed by me. The overall patch series has also had review at various times from Andres Freund, Ashutosh Sharma, Hannu Krosing, Vignesh C, Álvaro Herrera, and Tom Lane. Discussion: http://postgr.es/m/CAFiTN-trubju5YbWAq-BSpZ90-Z6xCVBQE8BVqXqANOZAF1Znw@mail.gmail.com
* Defend against stack overrun in a few more places.Tom Lane2022-08-24
| | | | | | | | | | | | | | | | | | | SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c, and regex_selectivity_sub() in selectivity estimation could recurse until stack overflow; fix by adding check_stack_depth() calls. So could next() in the regex compiler, but that case is better fixed by converting its tail recursion to a loop. (We probably get better code that way too, since next() can now be inlined into its sole caller.) There remains a reachable stack overrun in the Turkish stemmer, but we'll need some advice from the Snowball people about how to fix that. Per report from Egor Chindyaskin and Alexander Lakhin. These mistakes are old, so back-patch to all supported branches. Richard Guo and Tom Lane Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
* Fix ICU locale option handling in CREATE DATABASEPeter Eisentraut2022-08-24
| | | | | | | | | The code took the LOCALE option as the default/fallback for ICU_LOCALE, but this was neither documented nor intended, so remove it. (It was probably left in from an earlier patch version.) Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/f385ba25e7f8be427b8c582e5cca7d79%40postgrespro.ru#515a31c5429d6d37ad1d5c9d66962a1e
* Remove initialization of MyClientConnectionInfo at backend startupMichael Paquier2022-08-24
| | | | | | | | | This stuff should be already initialized at process startup, so adding this extra step is confusing for no gain. Per gripe from Tom Lane and Jacob Champion. Discussion: https://postgr.es/m/bbf2b922-4ff7-5c30-e3ef-2a8bdcdd1116@timescale.com
* Further -Wshadow=compatible-local warning fixesDavid Rowley2022-08-24
| | | | | | | | | | | | | These should have been included in 421892a19 as these shadowed variable warnings can also be fixed by adjusting the scope of the shadowed variable to put the declaration for it in an inner scope. This is part of the same effort as f01592f91. By my count, this takes the warning count from 114 down to 106. Author: David Rowley and Justin Pryzby Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com
* Allow parallel workers to retrieve some data from PortMichael Paquier2022-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | This commit moves authn_id into a new global structure called ClientConnectionInfo (mapping to a MyClientConnectionInfo for each backend) which is intended to hold all the client information that should be shared between the backend and any of its parallel workers, access for extensions and triggers being the primary use case. There is no need to push all the data of Port to the workers, and authn_id is quite a generic concept so using a separate structure provides the best balance (the name of the structure has been suggested by Robert Haas). While on it, and per discussion as this would be useful for a potential SYSTEM_USER that can be accessed through parallel workers, a second field is added for the authentication method, copied directly from Port. ClientConnectionInfo is serialized and restored using a new parallel key and a structure tracks the length of the authn_id, making the addition of more fields straight-forward. Author: Jacob Champion Reviewed-by: Bertrand Drouvot, Stephen Frost, Robert Haas, Tom Lane, Michael Paquier, Julien Rouhaud Discussion: https://postgr.es/m/793d990837ae5c06a558d58d62de9378ab525d83.camel@vmware.com
* Further reduce warnings with -Wshadow=compatible-localDavid Rowley2022-08-24
| | | | | | | | | | | | | | | | | | | | | | | In a similar effort to f01592f91, here we're targetting fixing the warnings that -Wshadow=compatible-local produces that we can fix by moving a variable to an inner scope to stop that variable from being shadowed by another variable declared somewhere later in the function. All of the warnings being fixed here are changing the scope of variables which are being used as an iterator for a "for" loop. In each instance, the fix happens to be changing the for loop to use the C99 type initialization. Much of this code likely pre-dates our use of C99. Reducing the scope of the outer scoped variable seems like the safest way to fix these. Renaming seems more likely to risk patches using the wrong variable. Reducing the scope is more likely to result in a compilation failure after applying some future patch rather than introducing bugs with it. By my count, this takes the warning count from 129 down to 114. Author: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com
* Remove our artificial PG_SOMAXCONN limit on listen queue length.Tom Lane2022-08-23
| | | | | | | | | | | | | | | | I added this in commit 153f40067, out of paranoia about kernels possibly rejecting very large listen backlog requests. However, POSIX has said for decades that the kernel must silently reduce any value it considers too large, and there's no evidence that any current system doesn't obey that. Let's just drop this limit and save some complication. While we're here, compute the request as twice MaxConnections not twice MaxBackends; the latter no longer means what it did in 2001. Per discussion of a report from Kevin McKibbin. Discussion: https://postgr.es/m/CADc_NKg2d+oZY9mg4DdQdoUcGzN2kOYXBu-3--RW_hEe0tUV=g@mail.gmail.com
* Doc: prefer sysctl to /proc/sys in docs and comments.Tom Lane2022-08-23
| | | | | | | | | sysctl is more portable than Linux's /proc/sys file tree, and often easier to use too. That's why most of our docs refer to sysctl when talking about how to adjust kernel parameters. Bring the few stragglers into line. Discussion: https://postgr.es/m/361175.1661187463@sss.pgh.pa.us
* Add CHECK_FOR_INTERRUPTS while decoding changes.Amit Kapila2022-08-23
| | | | | | | | | | | | | While decoding changes in a loop, if we skip all the changes there is no CFI making the loop uninterruptible. Reported-by: Whale Song and Andrey Borodin Bug: 17580 Author: Masahiko Sawada Reviwed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
* Remove redundant call to pgstat_report_wal()Andres Freund2022-08-22
| | | | | | | | | | | pgstat_report_stat() will be called before shutdown so an explicit call to pgstat_report_wal() just before shutdown is redundant. This likely was not redundant before 5891c7a8ed8, but now it clearly is. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com
* Add BackendType for standalone backendsAndres Freund2022-08-22
| | | | | | | | | | | | | All backends should have a BackendType to enable statistics reporting per BackendType. Add a new BackendType for standalone backends, B_STANDALONE_BACKEND (and alphabetize the BackendTypes). Both the bootstrap backend and single user mode backends will have BackendType B_STANDALONE_BACKEND. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU%2BECH4tNwGFNERkZA%40mail.gmail.com
* pgstat: Acquire lock when reading variable-numbered statsAndres Freund2022-08-22
| | | | | | | | | | | | | | Somewhere during the development of the patch acquiring a lock during read access to variable-numbered stats got lost. The missing lock acquisition won't cause corruption, but can lead to reading torn values when accessing stats. Add the missing lock acquisitions. Reported-by: Greg Stark <stark@mit.edu> Reviewed-by: "Drouvot, Bertrand" <bdrouvot@amazon.com> Reviewed-by: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CAM-w4HMYkM_DkYhWtUGV+qE_rrBxKOzOF0+5faozxO3vXrc9wA@mail.gmail.com Backpatch: 15-
* Switch format specifier for replication origins to %dJohn Naylor2022-08-23
| | | | | | | | | Using %u with uint16 causes warnings with -Wformat-signedness. There are many other warnings, but for now change only these since c920fe4818 already changed the message string for most of them. Per report from Peter Eisentraut Discussion: https://www.postgresql.org/message-id/31e63649-0355-7088-831e-b07d5f908a8c%40enterprisedb.com
* Remove empty statementJohn Naylor2022-08-23
| | | | | | Peter Smith Discussion: https://www.postgresql.org/message-id/CAHut%2BPtRGVuj8Q_GpHHxZyk7fGwdYDG8_s4GSfKoc_4Yd9vR-w%40mail.gmail.com
* Make role grant system more consistent with other privileges.Robert Haas2022-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, membership of role A in role B could be recorded in the catalog tables only once. This meant that a new grant of role A to role B would overwrite the previous grant. For other object types, a new grant of permission on an object - in this case role A - exists along side the existing grant provided that the grantor is different. Either grant can be revoked independently of the other, and permissions remain so long as at least one grant remains. Make role grants work similarly. Previously, when granting membership in a role, the superuser could specify any role whatsoever as the grantor, but for other object types, the grantor of record must be either the owner of the object, or a role that currently has privileges to perform a similar GRANT. Implement the same scheme for role grants, treating the bootstrap superuser as the role owner since roles do not have owners. This means that attempting to revoke a grant, or admin option on a grant, can now fail if there are dependent privileges, and that CASCADE can be used to revoke these. It also means that you can't grant ADMIN OPTION on a role back to a user who granted it directly or indirectly to you, similar to how you can't give WITH GRANT OPTION on a privilege back to a role which granted it directly or indirectly to you. Previously, only the superuser could specify GRANTED BY with a user other than the current user. Relax that rule to allow the grantor to be any role whose privileges the current user posseses. This doesn't improve compatibility with what we do for other object types, where support for GRANTED BY is entirely vestigial, but it makes this feature more usable and seems to make sense to change at the same time we're changing related behaviors. Along the way, fix "ALTER GROUP group_name ADD USER user_name" to require the same privileges as "GRANT group_name TO user_name". Previously, CREATEROLE privileges were sufficient for either, but only the former form was permissible with ADMIN OPTION on the role. Now, either CREATEROLE or ADMIN OPTION on the role suffices for either spelling. Patch by me, reviewed by Stephen Frost. Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com
* Fix assertion failure in CREATE DATABASEPeter Eisentraut2022-08-22
| | | | | | | | | An assertion would fail when creating a database with libc locale provider from a template database with icu locale provider. Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f385ba25e7f8be427b8c582e5cca7d79%40postgrespro.ru#515a31c5429d6d37ad1d5c9d66962a1e
* Use logical operator && instead of & in vacuumparallel.c.Amit Kapila2022-08-22
| | | | | | | | | | | As such the current usage of & won't produce incorrect results but it would be better to use && to short-circuit the evaluation of second condition when the same is not required. Author: Ranier Vilela Reviewed-by: Tom Lane, Bharath Rupireddy Backpatch-through: 15, where it was introduced Discussion: https://postgr.es/m/CAEudQApL8QcoYwQuutkWKY_h7gBY8F0Xs34YKfc7-G0i83K_pw@mail.gmail.com
* Remove shadowed local variables that are new in v15David Rowley2022-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | Compiling with -Wshadow=compatible-local yields quite a few warnings about local variables being shadowed by compatible local variables in an inner scope. Of course, this is perfectly valid in C, but we have had bugs in the past as a result of developers failing to notice this. af7d270dd is a recent example. Here we do a cleanup of warnings we receive from -Wshadow=compatible-local for code which is new to PostgreSQL 15. We've yet to have the discussion about if we actually ever want to run that as a standard compilation flag. We'll need to at least get the number of warnings down to something easier to manage before we can realistically consider if we want this or not. This commit is the first step towards reducing the warnings. The changes being made here are all fairly trivial. Because of that, and the fact that v15 is still in beta, this is being back-patched into 15. It seems more risky not to do this as the risk of future bugs is increased by the additional conflicts that this commit could cause for any future bug fixes touching the same areas as this commit. Author: Justin Pryzby Discussion: https://postgr.es/m/20220817145434.GC26426%40telsasoft.com Backpatch-through: 15
* Avoid reltuples distortion in very small tables.Peter Geoghegan2022-08-19
| | | | | | | | | | | | | | | | | | | | | | | Consistently avoid trusting a sample of only one page at the point that VACUUM determines a new reltuples for the target table (though only when the table is larger than a single page). This is follow-up work to commit 74388a1a, which added a heuristic to prevent reltuples from becoming distorted by successive VACUUM operations that each scan only a single heap page (which was itself more or less a bugfix for an issue in commit 44fa8488, which simplified VACUUM's handling of scanned pages). The original bugfix commit did not account for certain remaining cases that where not affected by its "2% of total relpages" heuristic. This happened with relations that are small enough that just one of its pages exceeded the 2% threshold, yet still big enough for VACUUM to deem skipping most of its pages via the visibility map worthwhile. reltuples could still become distorted over time with such a table, at least in scenarios where the VACUUM command is run repeatedly and without the table itself ever changing. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wzk7d4m3oEbEWkWQKd+gz-eD_peBvdXVk1a_KBygXadFeg@mail.gmail.com Backpatch: 15-, where the rules for scanned pages changed.
* Initialize index stats during parallel VACUUM.Peter Geoghegan2022-08-18
| | | | | | | | | | | | | Initialize shared memory allocated for index stats to avoid a hard crash. This was possible when parallel VACUUM became confused about the current phase of index processing. Oversight in commit 8e1fae1938, which refactored parallel VACUUM. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-By: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220818133406.GL26426@telsasoft.com Backpatch: 15-, the first version with the refactoring commit.
* Ensure that pg_auth_members.grantor is always valid.Robert Haas2022-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, "GRANT foo TO bar" or "GRANT foo TO bar GRANTED BY baz" would record the OID of the grantor in pg_auth_members.grantor, but that role could later be dropped without modifying or removing the pg_auth_members record. That's not great, because we typically try to avoid dangling references in catalog data. Now, a role grant depends on the grantor, and the grantor can't be dropped without removing the grant or changing the grantor. "DROP OWNED BY" will remove the grant, just as it does for other kinds of privileges. "REASSIGN OWNED BY" will not, again just like what we do in other cases involving privileges. pg_auth_members now has an OID column, because that is needed in order for dependencies to work. It also now has an index on the grantor column, because otherwise dropping a role would require a sequential scan of the entire table to see whether the role's OID is in use as a grantor. That probably wouldn't be too large a problem in practice, but it seems better to have an index just in case. A follow-on patch is planned with the goal of more thoroughly rationalizing the behavior of role grants. This patch is just trying to do enough to make sure that the data we store in the catalogs is at some basic level valid. Patch by me, reviewed by Stephen Frost Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com
* Improve performance of adjust_appendrel_attrs_multilevel.Tom Lane2022-08-18
| | | | | | | | | | | | | | | | | | | | | | | | The present implementations of adjust_appendrel_attrs_multilevel and its sibling adjust_child_relids_multilevel are very messy, because they work by reconstructing the relids of the child's immediate parent and then seeing if that's bms_equal to the relids of the target parent. Aside from being quite inefficient, this will not work with planned future changes to make joinrels' relid sets contain outer-join relids in addition to baserels. The whole thing can be solved at a stroke by adding explicit parent and top_parent links to child RelOptInfos, and making these functions work with RelOptInfo pointers instead of relids. Doing that is simpler for most callers, too. In my original version of this patch, I got rid of RelOptInfo.top_parent_relids on the grounds that it was now redundant. However, that adds a lot of code churn in places that otherwise would not need changing, and arguably the extra indirection needed to fetch top_parent->relids in those places costs something. So this version leaves that field in place. Discussion: https://postgr.es/m/553080.1657481916@sss.pgh.pa.us
* Adjust assertion in XLogDecodeNextRecord.Robert Haas2022-08-18
| | | | | | | | | | | | | | | | | As written, if you use XLogBeginRead() to position an xlogreader at the beginning of a WAL page and then try to read WAL, this assertion will fail. However, the header comment for XLogBeginRead() claims that positioning an xlogreader at the beginning of a page is valid, and the code here is perfectly able to cope with it. It's only the assertion that causes trouble. So relax it. This is formally a bug in all supported branches, but as it doesn't seem to have any consequences for current uses of the xlogreader facility, no back-patch, at least for now. Dilip Kumar and Robert Haas Discussion: http://postgr.es/m/CA+TgmoaJSs2_7WHW2GzFYe9+zfPtxBKvT3GW47+x=ptUE=cULw@mail.gmail.com